(54) Hepatitis C Virus Epitopes

(57) Peptide antigens which are immunoreactive 
with sera from individuals infected with hepatitis C virus 
(HCV) are disclosed. Several of the antigens are immu- 
nologically reactive with antibodies present in individu- 
als identified as having chronic and acute HCV 
infection. The antigens are useful in diagnostic methods 
for detecting HCV infection in humans. Also disclosed 
are corresponding genomic-fragment clones containing 
polynucleotides encoding the open reading frame 
sequences for the antigenic peptides.

Description

1. Field of Invention 

[0001] This invention relates to specific peptide viral antigens which are immunoreactive with sera from patients
infected with parenterally transmitted non-A, non-B hepatitis virus (PT-NANBH, now called Hepatitis C Virus), to
polynucleotide sequences which encode the peptides, to an expression system capable of producing the peptides, and to
methods of using the peptides for detecting PT-NANBH infection in human sera.

3. Background

[0003] Viral hepatitis resulting from a virus other than hepatitis A virus (HAV) and hepatitis B virus (HBV) has been 
referred to as non-A, non-B hepatitis (NANBH). More recently, it has become clear that NANBH encompasses at least 
two, and perhaps more, quite distinct viruses. One of these, known as enterically transmitted NANBH or ET-NANBH, is 
contracted predominantly in poor-sanitation areas where food and drinking water have been contaminated by fecal
matter. The molecular cloning of a portion of this virus, referred to as the hepatitis E virus (HEV), has recently been
described (Reyes et al.). 

[0004] The second NANB virus type, known as parenterally transmitted NANBH, or PT-NANBH, is transmitted by 
parenteral routes, typically by exposure to blood or blood products. Approximately 10% of transfusions cause PT- 
NANBH infection, and about half of these go on to a chronic disease state (Dienstag). 
[0005] Human sera documented as having produced post-transfusion NANBH in human recipients have been used
successfully to produce PT-NANBH infection in chimpanzees (Bradley). RNA isolated from infected chimpanzee sera 
has been used to construct cDNA libraries in an expression vector for immunoscreening with chronic-state human PT- 
NANBH serum. This procedure identified a PT-NANBH specific cDNA clone, and the viral sequence was then used as
a probe to identify fragments making up 7,300 contiguous basepairs of a PT-NANBH viral agent (EPO patent application
88310922.5, filed 11/18/88). The same procedure was used by the present inventors to derive two of the PT-NANBH
peptide and polynucleotide sequences disclosed herein. The sequenced viral agent has been named hepatitis C virus
(HCV) (above EPO patent application).
[0006] Heretofore, one immunogenic peptide encoded by the HCV viral agent has been reported (Choo, Kuo, EPO
application 88310922.5). This peptide, designated C-100, has been used in immunoassays of PT-NANBH sera and
found to react immunospecifically with up to 80% of chronic NANBH samples, and about 15% of acute NANBH
samples (Kuo).

[0007] It is desirable to provide one or a collection of peptide antigens which are immunoreactive with a greater
percentage of PT-NANBH-infected blood, including both acute and chronic PT-NANBH infection.

4. Summary of the Invention 

[0008] It is one general object of the invention to provide recombinant polypeptides immunoreactive with sera from 
humans infected with hepatitis C virus (HCV), including a peptide which is immunoreactive with a high percentage of
sera from chronic HCV-infected individuals, and peptides which are immunoreactive with sera associated with acute 
HCV infection. 

[0009] It is another object of the invention to provide an HCV polynucleotide sequence encoding a sequence for 
recombinant production of the peptide antigens, and a diagnostic method for detecting HCV-infected human sera using 
the peptide antigens.

[0010] The invention includes, in one aspect, a peptide antigen which is immunoreactive with sera from humans 
infected with HCV. One peptide antigen in the invention includes an immunoreactive portion of an HCV polypeptide 
which: 

a) is encoded by an HCV coding sequence;

b) has 504 amino acid residues; and 

c) has the carboxy-terminal sequence presented as SEQ ID NO:4. 

[0011] Other peptide antigens of the invention include an immunoreactive portion of any one of the following 
sequences: SEQ ID NO:2, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID
NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, and SEQ ID NO:26.

[0012] In another aspect, the invention includes diagnostic kits for use in screening human blood containing anti- 
bodies specific against HCV infection. The kit includes at least one peptide antigen which is immunoreactive with sera 
from humans infected with hepatitis C virus (HCV); specific peptide antigens for use in the kit are given above.
[0013] One preferred embodiment of the present invention is a diagnostic kit containing the 409-1-1(c-a) peptide (SEQ ID
NO:8) and one of the HCV-capsid derived proteins (SEQ ID NOs:12, 14, 16, 18, 20, 22, 24, and 26), two particular
embodiments being 409-1-1(c-a) with the C1NC450 capsid-derived peptide, and 409-1-1(c-a) with the C1NC360
capsid-derived peptide.

[0014] In one embodiment of the present invention, the antigen is immobilized on a solid support. The binding of 
HCV-specific antibodies to the immobilized antigen is detected by a reporter-labeled anti-human antibody which acts to
label the solid support with a detectable reporter. 

[0015] The kit is used in a method for detecting HCV infection in an individual by: (i) reacting serum from an HCV-
infected test individual with the above peptide antigen, and (ii) examining the antigen for the presence of bound anti- 
body. 

[0016] The peptide antigens are produced, in accordance with another aspect of the invention, using an expression
system for expressing a recombinant peptide antigen which is immunoreactive with sera from humans infected with 
hepatitis C virus (HCV). A selected expression vector containing an open reading frame (ORF) of a polynucleotide 
which encodes the peptide is introduced into a suitable host, which is cultured under conditions which promote expres- 
sion of the ORF in the expression vector. 

[0017] In one embodiment, the polynucleotide is inserted into an expression site in a lambda gt11 phage vector,
and the vector is introduced into an E. coli host. The following E. coli hosts have been deposited which contain vectors
including the coding sequences of the antigens shown in parentheses: ATCC No. 40901 (SEQ ID NO:3), ATCC No.
40893 (SEQ ID NO:1), ATCC No. 40792 (SEQ ID NO:7), and ATCC No. 40876 (SEQ ID NO:9). pGEX and pET are
two other vectors which have been used to express HCV antigens. It will be appreciated that determination of other
appropriate vector and host combinations for the expression of the above sequences is within the ability of one of
ordinary skill in the art.

[0018] Also forming part of the invention are polynucleotides which encode polypeptides immunoreactive with sera
from humans infected with hepatitis C virus (HCV). One polynucleotide of the present invention encodes a polypeptide
wherein the polypeptide includes an immunoreactive portion of a peptide sequence which:

a) is encoded by an HCV coding sequence; 

b) has 504 amino acid residues; and 

c) has the carboxy-terminal sequence presented as SEQ ID NO:4; and, where the carboxy-terminal amino acid
sequence of said peptide antigen is encoded by the polynucleotide sequence presented as SEQ ID NO:3.

[0019] Other polynucleotides of the invention include any one of the following sequences: SEQ ID NO:1, SEQ ID 
NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ
ID NO:19, SEQ ID NO:21, SEQ ID NO:23, and SEQ ID NO:25.

[0020] These and other objects and features of the invention will become more fully apparent when the following 
detailed description is read in conjunction with the accompanying drawings. 

5. Brief Description of the Drawings

[0021] 

Figure 1 illustrates the steps in producing overlapping linking fragments of a nucleic acid segment in accordance 
with the methods of the present invention; 
Figure 2 shows the positions of overlap primer regions and linking regions along a 7,300 basepair portion of the
HCV genome. 

Figure 3 shows the DNA coding sequence of the clone 40 insert. The underlined sequences correspond to an R9
primer region.

Figure 4 shows the DNA coding sequence of a clone 36 insert. The underlined sequences correspond, respectively,
to the F7, F8, and R8 primer regions.

Figure 5 shows the DNA and protein coding sequences for a 409-1-1(abc) clone insert. The "A" region of this
sequence is delineated by boxes, the "B" region by a box and a triangle, and the "C" region by a triangle and an
asterisk.

Figure 6 shows the DNA and protein coding sequences for a 409-1-1(c-a) clone insert.
Figure 7 illustrates the groups of clones which have been obtained from the HCV genome in the region corresponding
to the 409-1-1(abc) clone insert.

Figure 8A shows the DNA and protein coding sequences for the pGEX-GG1 insert. The three G's above the first
line indicate where substitutions were made to generate the clone pGEX-CapA. Figure 8B shows the DNA and protein
sequences for the pGEX-CapA insert coding sequence. The primers used in polymerase chain reactions to
generate carboxy and amino terminal deletions are indicated below the nucleotide line. The sequences of the primers
are indicated as the sense (coding) strand. The actual sequence of the NC (non-coding) primers is the reverse
complement of the indicated sequence. Coding primers are underlined; reverse (noncoding) primers are double-underlined.
Sequences shown in capital letters are exact matches. Sequences in lowercase letters are "mismatched"
sequences used to introduce the terminal restriction sites (NcoI at the 5' ends and BamHI at the 3' ends).
The three nucleotides which have been altered to remove the "slippery codons" at positions 24, 27, and 30 are indicated
by bold type with the wild type A residues shown above the sequence.

Figure 9 shows a hydropathicity plot of the HCV-core protein encoded by pGEX-CapA. The relative locations of the
primers used to generate carboxy and amino terminal deletions are indicated relative to the protein coding
sequence by arrows.
Figure 10 shows an epitope map of the HCV capsid protein region.
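
The relationship noted for Figure 8B, in which the NC (non-coding) primers are the reverse complement of the indicated sense-strand sequence, can be sketched as follows. This is a minimal illustration only: the example sequence is hypothetical and is not one of the primers of the invention, and the function name is chosen here for convenience.

```python
# Complement table for unambiguous DNA bases. Case is preserved because
# lowercase letters mark "mismatched" bases in the figure legend.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sense: str) -> str:
    """Return the reverse complement of a sense-strand DNA sequence."""
    return sense.translate(_COMPLEMENT)[::-1]


# Hypothetical sense-strand region ending in a lowercase BamHI site (ggatcc).
example_sense = "ATGAGCACGggatcc"
nc_primer = reverse_complement(example_sense)
print(nc_primer)
```

Applying the function twice returns the original sequence, which is a convenient sanity check when converting between the indicated and the actual primer sequences.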

6. Detailed Description of the Invention

I. Definitions

[0022] The terms defined below have the following meaning herein: 

1. "Parenterally transmitted non-A, non-B hepatitis viral agent (PT-NANBH)" means a virus, virus type, or virus
class which (i) causes parenterally transmitted infectious hepatitis, (ii) is transmissible in chimpanzees, and (iii) is
serologically distinct from hepatitis A virus (HAV), hepatitis B virus (HBV), and hepatitis E virus (HEV).

2. "Hepatitis C virus (HCV)" means a PT-NANBH viral agent whose polynucleotide sequence includes the sequence of the
7,300 basepair region of HCV given in the Appendix, and variations of the sequence, such as degenerate codons,
or variations which may be present in different isolates or strains of HCV.

3. Two nucleic acid fragments are "homologous" if they are capable of hybridizing to one another under hybridization
conditions described in Maniatis et al., op. cit., pp. 320-323, using the following wash conditions: 2 x SSC, 0.1%
SDS, room temperature twice, 30 minutes each; then 2 x SSC, 0.1% SDS, 50°C once, 30 minutes; then 2 x SSC,
room temperature twice, 10 minutes each. Under these conditions, homologous sequences can be identified that
contain at most about 25-30% basepair mismatches. More preferably, homologous nucleic acid strands contain
15-25% basepair mismatches, even more preferably 5-15% basepair mismatches. These degrees of homology can be
selected by using more stringent wash or hybridization conditions for identification of clones from gene libraries
(or other sources of genetic material), as is well known in the art.

4. A DNA fragment is "derived from" HCV if it has substantially the same basepair sequence as a region of the HCV
viral genome which was defined in (2) above.

5. A protein is "derived from" a PT-NANBH or HCV viral agent if it is encoded by an open reading frame of a cDNA
or RNA fragment derived from a PT-NANBH or HCV viral agent, respectively.
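
The mismatch ranges given in definition 3 can be illustrated with a short sketch. Note that the definition itself is operational (hybridization under the stated wash conditions); the code below is only a sequence-based proxy that assumes two pre-aligned, equal-length sequences. The tier boundaries at 15%, 25%, and 30% and the tier labels are assumptions made here for illustration.

```python
def percent_mismatch(a: str, b: str) -> float:
    """Percentage of mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)
    return 100.0 * mismatches / len(a)


def homology_tier(pct: float) -> str:
    """Map a mismatch percentage onto the ranges named in definition 3 (assumed cut-offs)."""
    if pct <= 15.0:
        return "even more preferred"  # 5-15% range; fewer mismatches also land here
    if pct <= 25.0:
        return "more preferred"       # 15-25% range
    if pct <= 30.0:
        return "homologous"           # at most about 25-30%
    return "not homologous"


# Hypothetical 10-base aligned pair with a single mismatch.
pct = percent_mismatch("ACGTACGTAC", "ACGTACGTAA")
print(pct, homology_tier(pct))
```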

II. Molecular Clone Selection by Immunoscreening 

[0023] As one approach toward identifying a molecular clone of a PT-NANBH agent, cDNA libraries are prepared 
from infected sera in the expression vector lambda gt11. cDNA sequences are then selected for expression of peptides
which are immunoreactive with PT-NANBH-infected sera. Recombinant proteins identified by this approach provide
candidates for peptides which can serve as substrates in diagnostic tests. Further, the nucleic acid coding sequences
identified by this approach serve as useful hybridization probes for the identification of further PT-NANBH coding
sequences. 

[0024] In order to make immunoscreening a useful approach for identifying clones originating from PT-NANBH coding
sequences, a well-defined source of PT-NANBH virus is important. To generate such a source, a chimpanzee (#771;
Example 1A) was infected with transmissible PT-NANBH agents using a Factor VIII concentrate as a source (Bradley).
The Factor VIII concentrate was known to contain at least two forms of parenterally transmitted NANB hepatitis
(PT-NANBH). In addition to a chloroform-sensitive agent, which has subsequently been called hepatitis C virus (HCV),
a chloroform-resistant form of PT-NANBH was also transmitted in the concentrate (Bradley, 1983).

[0025] In the method illustrated in Example 1, infected serum was pelleted, without dilution, by centrifugation, and
cDNA libraries were generated from the resulting pelleted virus (Example 1B and 1C). Sera from infected human
sources were treated in the same fashion. cDNA libraries were generated, e.g., by a random primer method using the
RNA extracted from pelleted sera as starting material (Example 1B and 1C). The resulting cDNA molecules were then
cloned into a suitable vector, for example, lambda gt11, for expression and screening of peptide antigens, and lambda
gt10, for hybridization screening (Example 1C(iv)). Lambda gt11 is a particularly useful expression vector which
contains a unique EcoRI insertion site 53 base pairs upstream of the translation termination codon of the
beta-galactosidase gene. Thus, an inserted sequence is expressed as a beta-galactosidase fusion protein which contains the
N-terminal portion of the beta-galactosidase gene, the heterologous peptide, and optionally the C-terminal region of the
beta-galactosidase peptide (the C-terminal portion being expressed when the heterologous peptide coding sequence
does not contain a translation termination codon). This vector also produces a temperature-sensitive repressor (cI857)
which causes viral lysogeny at permissive temperatures, e.g., 32°C, and leads to viral lysis at elevated temperatures,
e.g., 42°C. Advantages of this vector include: (1) highly efficient recombinant clone generation, (2) ability to select
lysogenized host cells on the basis of host-cell growth at permissive, but not non-permissive, temperatures, and (3) high
levels of recombinant fusion protein production. Further, since phage containing a heterologous insert produces an
inactive beta-galactosidase enzyme, phage with inserts are typically identified using a beta-galactosidase
colored-substrate reaction.

[0026] In the screening procedure reported in Examples 1-3, individual cDNA libraries were prepared from the
serum of one PT-NANBH infected chimpanzee (#771) and four PT-NANBH infected humans (designated EGM, BV,
WEH, and AG). These five libraries were immunoscreened using PT-NANBH positive human or chimpanzee sera
(Example 2): 111 lambda gt11 clones were identified which were immunoreactive with at least one of the sera. Of these
111 clones, 93 were examined for insert hybridization with normal DNA. The inserts were radioactively labelled and
used as probes against HindIII/EcoRI doubly-digested human peripheral blood lymphocyte (PBL) DNA (Example 3).
Approximately 46% (43/93) of the inserts hybridized with normal human PBL DNA and were therefore not pursued. Inserts
from 11 PT-NANBH-immunopositive clones derived from chimpanzee #771 sera were characterized as exogenous to
normal human PBL DNA (Example 3). Of these 11 clones, 2 PT-NANBH clones were identified having the following
characteristics. One clone (clone 40) was clearly exogenous by repeated hybridization tests against normal human PBL
DNA, had a relatively small insert size (approximately 0.5 kilobases), and was quite unreactive with negative control
serum. The second clone (clone 36) was shown to be reactive with multiple PT-NANBH antisera, had a relatively large
insert size (approximately 1.5 kilobases), and was exogenous by hybridization testing against normal human PBL DNA.
The immunoreactive characteristics of clones 36 and 40 are summarized in Table 1 (Example 3). Clone 36 was
immunoreactive with chimpanzee #771 sera and two HCV-positive human sera, AG and BV. The clone 36 antigen did not
immunoreact with the negative control sera SKR. Clone 40 was immunoreactive with chimpanzee #771 sera and was
cleanly nonreactive when the negative control sera was used for screening.

EP 1 018 558 A2
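
The screening counts reported above can be checked with simple arithmetic. The sketch below uses only the figures stated in the text (111 immunopositive clones, 93 examined, 43 hybridizing to normal human PBL DNA); the variable names are illustrative.

```python
# Counts taken directly from the screening summary.
immunopositive_clones = 111
examined_for_hybridization = 93
hybridized_to_human_dna = 43

# Fraction of examined inserts that hybridized with normal human PBL DNA.
percent_endogenous = round(100 * hybridized_to_human_dna / examined_for_hybridization)

# Clones not examined, and examined clones that did not hybridize.
not_examined = immunopositive_clones - examined_for_hybridization
non_hybridizing = examined_for_hybridization - hybridized_to_human_dna

print(percent_endogenous, not_examined, non_hybridizing)
```

The computed value agrees with the "approximately 46%" stated for the inserts that were not pursued.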

[0027] The DNA sequence of clone 36 was determined in part and is shown in Figure 4. This sequence corre- 
sponds to nucleotides 5010 to 6516 of the HCV sequence given in the Appendix. The DNA sequence was also deter- 
mined for the clone 40 insert (Figure 3). This sequence is homologous to the HCV sequence (Appendix) in the region 
of approximately nucleotides 6515 to 7070. The inserts of two other chimpanzee #771 clones, clones 44 and 45, were 
found to be homologous to clone 40 by hybridization and sequence analysis (Example 4). The sequences for clones 36 
and 40 are contiguous sequences, with the clone 36 sequences being located 5' of the clone 40 sequences as pre- 
sented in the Appendix. Accordingly, these two clones represent isolation of a significant block of the HCV genome by 
the above-described immunoscreening methods. 
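
The coordinate relationships stated above can be verified with a short sketch. It assumes 1-based, inclusive nucleotide coordinates and treats the approximate clone 40 boundaries (6515 to 7070) as exact; the `Region` type and `overlap` helper are illustrative names introduced here.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlap(a: Region, b: Region) -> int:
    """Number of positions shared by two regions (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


clone36 = Region(5010, 6516)
clone40 = Region(6515, 7070)  # approximate coordinates per the text
combined = Region(clone36.start, clone40.end)

print(clone36.length, clone40.length, overlap(clone36, clone40), combined.length)
```

Under these assumptions the computed lengths are consistent with the insert sizes reported earlier (approximately 1.5 and 0.5 kilobases), and the two inserts abut with a small shared region, in keeping with their description as contiguous sequences.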

[0028] The four lambda gt11 clones 36, 40, 44, and 45 were deposited in the Genelabs Culture Collection, Genelabs
Incorporated, 505 Penobscot Drive, Redwood City, CA 94063. Further, the lambda gt11 clones of clones 36 and
40 were deposited with the American Type Culture Collection, 12301 Parklawn Dr., Rockville MD, 20852, and given the
deposit numbers ATCC No. 40901 and ATCC No. 40893.

III. PT-NANBH Sequence Identification by Hybridization Methods

[0029] The polynucleotides identified in Section II can be employed as probes in hybridization methods to identify 
further HCV sequences, and these can then be used as probes to identify additional sequences. The polynucleotides 
can be directly cloned or fragmented by partial digestion to generate random fragments. The resulting clones can be 
immunoscreened as described above to identify HCV antigen coding sequences. 

[0030] To illustrate how the inserts of clones 36 and 40 can be used to identify clones carrying HCV sequences, the 
insert of clone 40 was isolated and used as a hybridization probe against the individual cDNA libraries established in 
lambda gt10 (see above). Using the clone 40 probe approximately 24 independent hybridization-positive clones were 
plaque purified (Example 5). The positive signals arose with different frequencies in cDNA libraries from the different 
serum sources, suggesting that the hybridization signals were from the serum sources, rather than resulting from some 
common contaminant introduced during the cDNA synthesis or cloning (Table 2). One of the clones, 108-2-5, which 
tested positive by hybridization with the clone 40 insert, had an insert of approximately 3.7 kb (Example 6). Since it had 
such a large insert, clone 108-2-5 was chosen for further analysis. The serum source of this cDNA clone was EGM
human PT-NANBH serum (Example 1). 

[0031] The insert of 108-2-5 was isolated by EcoRI digestion of the lambda gt10 clone, electrophoretic fractiona- 
tion, and electroelution (Example 6). The isolated insert was treated with DNase I to generate random fragments 
(Example 6), and the resulting digest fragments were inserted into lambda gt11 phage vectors for immunoscreening.
The lambda gt11 clones of the 108-2-5 fragments were immunoscreened (Example 6) using human (BV and normal)
and chimpanzee #771 serum. Twelve positive clones were identified by first round immunoscreening with the human 
and chimp sera. Seven of the 12 clones were plaque purified and rescreened using chimpanzee #771 serum. Partial 
DNA sequences of the insert DNA were determined for two of the resulting clones, designated 328-16-1 and 328-16-2. 
These two clones contained sequences essentially identical to clone 40 (Example 6). 

[0032] The clone 36 insert can be used in a similar manner to probe the original cDNA library generated in lambda 
gt10. Specific subfragments of clone 36 may be isolated by polymerase chain reaction or after cleavage with restriction
endonucleases. These fragments can be radioactively labelled and used as probes against the cDNA libraries generated
in lambda gt10 (Example 1C). In particular, the 5' terminal sequences of the clone 36 insert are useful as probes
to identify clones overlapping this region. 

[0033] Further, the sequences provided by the terminal clone 36 insert sequences and the terminal clone 40 insert 
sequences are useful as specific sequence primers in first-strand DNA synthesis reactions (Maniatis et al.; Scharf et 
al.) using, for example, chimpanzee #771 sera generated RNA as substrate. Synthesis of the second-strand of the 
cDNA is randomly primed. The above procedures identify or produce cDNA molecules corresponding to nucleic acid 
regions that are 5' adjacent to the known clone 36 and 40 insert sequences. These newly isolated sequences can in
turn be used to identify further flanking sequences, and so on, to identify the sequences composing the HCV genome.
As described above, after new HCV sequences are isolated, the polynucleotides can be cloned and immunoscreened
to identify specific sequences encoding HCV antigens.

IV. Generating Overlapping Cloned Linking Fragments

[0034] This section describes a method for producing and identifying HCV peptides which may be useful as HCV- 
diagnostic antigens. The present method is used to generate a series of overlapping linking fragments which span a 
segment of nucleic acid. The application of the method to generating a series of overlapping linking fragments which 
span a 7,300 basepair segment of the HCV genome, whose sequence is given in the Appendix, will be described with 
reference to Figures 1 and 2. 

[0035] As a first step in the method, and with reference to Figure 1, the nucleic acid of interest is obtained in
double-strand DNA form. Typically, this is done by isolating genomic DNA fragments or by producing cDNAs from RNA species
present in a sample fluid. The latter method is used to generate double-strand DNA from NANBH viral RNA present in
serum from chimpanzees or humans with known PT-NANBH infection. Here RNA in the sample is isolated, e.g., by
guanidinium thiocyanate extraction of PEG precipitated virions, and reacted with a suitable primer for first strand cDNA
synthesis.

[0036] First-strand cDNA priming may be by random primers, oligo dT primers, or sequence-specific primer(s). The 
primer conditions are selected to (a) optimize generation of cDNA fragments which collectively will span the nucleic acid 
segment of interest, and (b) produce cDNA fragments which are preferably equal to or greater than about 1,000
basepairs in length. In one method applied to HCV RNA, the first-strand synthesis is carried out using sequence-specific
primers which are complementary to spaced regions along the length of the known HCV genomic sequence. The
primer positions are indicated at A, B, C, and D in Figure 2, which shows a map of the HCV genome segment. The
basepair locations of the primers in the HCV genome are given in Example 7 below. Following first strand synthesis, the
second cDNA strand is synthesized by standard methods.

[0037] The linking fragments in the method are produced by sequence-specific amplification of the double-strand 
DNA obtained as above, using pairs of overlap-region primers to be described. According to an important advantage of
the methods of the present invention, it is possible to generate linking fragments even when the amount of
double-strand DNA is too low for direct sequence-specific amplification. This limitation was found, for example, with HCV
cDNAs produced from NANBH-infected serum. Here the amount of double-stranded DNA available for amplification is
first amplified nonspecifically by a technique known as Sequence-Independent Single-Primer Amplification (SISPA).
[0038] The SISPA technique is detailed in co-owned U.S. Patent application for "RNA and DNA Amplification
Techniques", Serial No. 224,961, filed July 26, 1988. The method as applied to amplification of HCV cDNA fragments is also
described in Example 7. Briefly, known-sequence linker primers are attached to opposite ends of double-stranded DNA
in a DNA sample. These linkers then provide the common end sequences for primer-initiated amplification, using
primers complementary to the linker/primer sequences. Typically, the SISPA method is carried out for 20-30 cycles of
amplification, using thermal cycling to achieve successive denaturation and primer-initiated polymerization of second strand
DNA.
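
To give a sense of scale for the 20-30 cycles mentioned above, the following sketch computes the idealized fold amplification for a given number of thermal cycles. It assumes that each cycle multiplies the number of linker-flanked molecules by (1 + efficiency), with perfect doubling when the efficiency is 1.0; this per-cycle model and the `efficiency` parameter are assumptions made here and are not taken from the cited application. Real yields are typically lower.

```python
def ideal_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Idealized fold increase after `cycles` rounds of primer-initiated amplification.

    Each cycle is assumed to multiply the template count by (1 + efficiency).
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return (1.0 + efficiency) ** cycles


low = ideal_fold_amplification(20)   # lower end of the stated cycle range
high = ideal_fold_amplification(30)  # upper end of the stated cycle range
print(f"{low:.3g}-fold to {high:.3g}-fold under perfect doubling")
```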

[0039] Figure 1 illustrates the SISPA amplification of duplex DNA, to form amplified fragments which have
known-sequence regions Pj. As seen, the fragment mixture includes at least some fragments which (a) overlap at regions Pj
with other fragments in the mixture and (b) contain complete linking regions between adjacent Pj and Pj+1 regions.
Collectively, each linking region bounded by the associated overlap regions making up the segment is present in at least
one DNA fragment.
[0040] The production of overlapping linking fragments, in accordance with the methods of the present invention, is
carried out using the polymerase chain reaction (PCR) method described in U.S. Patent No. 4,683,195. In practicing
this step of the method, first the total segment of interest is divided into a series of overlapping intervals bounded by
regions of known sequence, as just described. In Figure 2, the 7,300 basepair segment of the HCV genome has been
divided into 10 intervals, each about 500-1,000 basepairs in length. The intervals are designated according to the
forward Fi and reverse Rj primers used in amplifying the sequence, as will be described. The selection of the intervals is
guided by (a) the requirement that the basepair sequence at each end of the interval be known, and (b) a preferred
interval length of between about 500 and 2,000 basepairs.
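
The division of a segment into overlapping intervals can be sketched numerically. The sketch below assumes evenly spaced overlap regions of 200 basepairs each, which is a simplification: in practice the boundaries are dictated by where known sequence is available, and the actual primer positions are those given in the Examples, which are not reproduced here. Function names are illustrative.

```python
def overlap_regions(segment_length: int, n_intervals: int, overlap_len: int = 200):
    """Place n_intervals + 1 evenly spaced overlap regions along the segment.

    Returns a list of (start, end) positions; spacing and overlap_len are assumed values.
    """
    step = (segment_length - overlap_len) / n_intervals
    return [(round(j * step), round(j * step) + overlap_len)
            for j in range(n_intervals + 1)]


def linking_intervals(regions):
    """Each interval runs from the start of one overlap region to the end of the next."""
    return [(regions[j][0], regions[j + 1][1]) for j in range(len(regions) - 1)]


def within_preferred_length(intervals, lo: int = 500, hi: int = 2000) -> bool:
    """Check the preferred interval length of about 500 to 2,000 basepairs."""
    return all(lo <= end - start <= hi for start, end in intervals)


regions = overlap_regions(7300, 10)
intervals = linking_intervals(regions)
print(len(regions), len(intervals), intervals[0], intervals[-1])
```

With these assumed values, ten intervals are bounded by eleven overlap regions, and adjacent intervals share one 200 basepair overlap region.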

[0041] In the method applied to the 7,300 basepair segment of the HCV genome, the regions of overlap between
the ten intervals were additionally amplified, to verify that the SISPA-amplified cDNA sample contained sufficient HCV
cDNA to observe PCR-amplification of HCV linking fragments, and that HCV regions along the entire length of the
genome were available for amplification. Each overlap region in the segment can be defined by a pair of primers which
includes a forward primer Fi and a reverse primer Rj which are complementary to opposite strands of opposite ends of
the overlap region. The primers are typically about 20 basepairs in length and span an overlap region of about 200
basepairs. The eleven overlap regions in the HCV segment and the regions corresponding to the forward and reverse
primers in each region are given in Example 8.

[0042] The primers Fi/Rj are added to the amplified DNA material in a PCR reaction mix, and the overlap region
bounded by the primers is amplified by 20-30 thermal cycles. The reaction material is then fractionated, e.g., by
agarose gel electrophoresis, and probed for the presence of the desired sequence, e.g., by Southern blotting (Southern),
using a radiolabeled oligonucleotide probe which is specific for an internal portion of the overlap region. As described
in Example 8, this method was successful in producing amplified fragments for each of the eleven Fi/Rj overlap regions
in the HCV genome segment. The overlap-region fragments may be used as probes for the corresponding (two) linking
fragments connected by the overlap region. It is emphasized, however, that this amplification step was employed to
confirm the presence of amplifiable cDNA along the length of the HCV genome, and not as an essential step in producing
the desired linking fragments. The step is omitted from Figure 1.

[0043] The linking fragments Fi/Rj are produced by a two-primer PCR procedure in which the SISPA-amplified DNA
fragments are amplified by a primer pair consisting of the forward primer Fi of one overlap region and the reverse primer
Rj of an adjacent overlap region. The ten overlap regions in the HCV segment and the regions corresponding to the
forward and reverse primers in each region are given in Example 9. Typical amplification conditions are given in Example
9. The amplified fragments in each reaction mixture are isolated and purified, e.g., by gel electrophoresis, to confirm the
expected fragment size. Southern blots may be probed with oligonucleotide probes complementary to internal regions
located between the fragment ends, to confirm the expected sequence of the fragments. As shown at the bottom in
Figure 1, the method generates the complete set of linking fragments, where each fragment is bounded by overlap
regions Pj and Pj+1.
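
The primer pairing used to produce the linking fragments can be sketched as follows. The sketch assumes that the overlap regions are numbered consecutively from 1 and that each linking fragment pairs the forward primer of one overlap region with the reverse primer of the next; the consecutive numbering is an assumption made for illustration and need not match the primer designations used in the Examples.

```python
def linking_primer_pairs(n_overlap_regions: int):
    """Pair the forward primer of each overlap region with the reverse primer of the next.

    Region numbering (1..n) is hypothetical; n overlap regions yield n - 1 linking fragments.
    """
    return [(f"F{i}", f"R{i + 1}") for i in range(1, n_overlap_regions)]


pairs = linking_primer_pairs(11)
for forward, reverse in pairs:
    print(f"{forward}/{reverse}")
```

With eleven overlap regions the sketch yields ten primer pairs, matching the ten linking fragments described for the 7,300 basepair segment.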

[0044] The method, as applied to generating ten overlapping linking fragments of the 7,300 basepair HCV genome, 
is described in Example 9. As demonstrated by size criteria on gel electrophoresis and by sequence criteria by South- 
ern blotting, the method was successful in generating all ten of the overlapping fragments spanning the HCV genome. 
[0045] It will be appreciated that the above flanking sequence amplification method can be applied to the genera- 
tion of DNA fragments corresponding to the insert sequences of clones 36 and 40, which have also been obtained by 
immunoscreening. The linker primers flanking the inserts are easily used to generate sequences corresponding to the 
clone inserts. For example, two-primer amplification of the SISPA-amplified cDNA fragments (Example 7) using the 
F12/R9 primer pair (the sequences of which are given in Example 8) is carried out under conditions similar to those
described in Example 9. The amplified fragment mixture is fractionated by agarose electrophoresis on 1.0% agarose,
and the expected band cut from the gel and eluted. 

[0046] The purified amplified fragment is treated with the Klenow fragment of DNA polymerase I to assure the molecules
are blunt-ended. The fragment is then ligated to EcoRI linkers (Example 10). The mixture is digested with EcoRI
and inserted into the lambda gt11 vector. The resulting clones contain the entire coding sequences of either the clone
36 or clone 40 inserts. 

[0047] Alternatively, the original amplified 36/40 fragment (primers F 12 /R 9 ) is briefly treated with Exonuclease III
(Boehringer Mannheim, as per manufacturer's instructions) to generate a family of fragments with different 5' ends. The
digestion products are treated as above and ligated into the lambda gt11 vector. The resulting plaques are then
immunoscreened.

[0048] Further, different sets of primers, other than the F 12 /R 9 primers described above, can be used to directly 
generate sequence encoding all, or portions, of clones 36 and 40. For example, primers F 8 /R 9 can generate a fragment
corresponding to a portion of the 3' sequences of the insert of clone 36 (Figure 4) and all of the insert sequences of 
clone 40 (Figure 3). Also, primers F 7 /R 8 can be used to directly generate a fragment corresponding to a portion of the 
5' sequences present in the insert of clone 36 (Figure 4). 

V. PT-NANBH Immunoreactive Peptide Fragments 

[0049] Several novel peptide antigens which are immunoreactive with sera from human and chimpanzee NANBH- 
infected sera have been generated from the NANBH linking fragments produced above, in accordance with the meth- 
ods of the present invention. Further, this method has confirmed antigenic regions previously identified by cDNA library 
immunoscreening (Section II above). The antigen peptides derived from linking fragments are preferably produced in a 
method which involves first digesting each of the above linking fragments with DNase I under partial digestion conditions,
yielding DNA digest fragments predominantly in the 100-300 basepair size range, as illustrated in Example 10.
The digest fragments may be size fractionated, for example by gel electrophoresis, to select those in the desired size 
range. 

[0050] The digest fragments from each linking fragment are then inserted into a suitable expression vector. One 
exemplary expression vector is lambda gt11, the advantages of which have been described above.
[0051] For insertion into the expression vector, the digest fragments may be modified, if needed, to contain selected
restriction-site linkers, such as EcoRI linkers, according to conventional procedures. Typically, the digest fragments are
blunt-ended, ligated with EcoRI linkers, and introduced into EcoRI-cut lambda gt11. Such recombinant techniques are
well known in the art (e.g., Maniatis et al.). 

[0052] The resulting viral genomic library may be checked to confirm that a relatively large (representative) library 
has been produced for each linking fragment. This can be done, in the case of the lambda gt11 vector, by infecting a
suitable bacterial host, plating the bacteria, and examining the plaques for loss of beta-galactosidase activity, as evi- 
denced by clear plaques. 

[0053] The presence of a digest-fragment insert in the clear plaques can be confirmed by amplifying the phage 
DNA, using primers specific for the regions of the gt11 phage flanking the EcoRI insert site, as described in Example
10B. The results in Table 3 show that a large percentage of the plaques tested in each linking fragment library contained
a digest-fragment insert. 

[0054] The linking-fragment libraries may also be screened for peptide antigens which are immunoreactive with
human or chimpanzee sera identified with PT-NANBH chronic, convalescent, or acute infection. One preferred
immunoscreening method is described in Example 10B. Here recombinant protein produced by the phage-infected bacteria
is transferred from the plaques to the filter. After washing, the filter is incubated with test serum, and then reacted with
reporter-labeled anti-human IgG antibody. The filter is then assayed for the presence of the reporter, which indicates
the presence of the peptide antigen on the filter. As seen from Table 3, several of the linking-fragment libraries were
positive for immunoreactive peptides in the primary screen.

[0055] The immunoscreening method just described can be used to identify library plaques from each of the linking
libraries which are immunoreactive with sera from humans or chimpanzees with known chronic, convalescent, or acute
PT-NANBH infection. One exemplary screening procedure is given in Example 11, where the ten HCV linking-fragment
libraries are screened with known PT-NANBH (a) human chronic serum, (b) chimpanzee acute pooled sera and (c)
chimpanzee chronic pooled sera. Of the ten libraries examined, only the F 1 /R 10 library did not give a positive
immunoreaction with any of the three sera. Several of the fragment libraries, including F 3 /R 4 , F 6 /R 12 , F 12 /R 7 , and
F 7 /R 8 , showed five or more positive reactions with chimpanzee acute sera, indicating that these libraries each express
one or more peptide antigens which are useful for detecting chimpanzee or human acute PT-NANBH infection.
[0056] The fragment library F 7 /R 8 corresponds to an internal fragment of the clone 36 insert (Section II; Figure 4).
Accordingly, the linking fragment method confirmed that this DNA region encodes a useful antigen. Further, the
fragment library F 8 /R 9 contains the sequences present in the clone 40 insert (Section II; Figures 3 and 4). The results in
Table 4 indicate that at least one peptide antigen effective to detect the presence of chronic-infection serum was
isolated from the F 8 /R 9 fragment library.

VI. Immunoreactive 409-1-1 Peptides 

A. Immunoreactive Screening 

[0057] Two of the immunoreactive plaques identified by immunoreactive screening, designated 409-1-1(abc) and
409-1-1(c-a), were tested for immunoreactivity against well-documented PT-NANBH chronic sera which showed strong
immunoreactivity to the 5-1-1 HCV peptide antigen (Kuo). The 5-1-1 HCV peptide antigen has previously been
identified as immunoreactive against a high percentage of human PT-NANBH chronic sera. The 5-1-1 antigen is encoded by
the sequence between basepairs 3731 and 3857 in the HCV genome (Appendix) and is itself contained in a larger
peptide antigen C-100 encoded by the sequence between basepairs 3531 and 4442. The latter peptide is employed in a
commercial diagnostic kit for detection of human HCV infection (Ortho/Chiron). The kit is reported to react positively
with about 80% of human chronic PT-NANBH samples, and about 15% of human acute PT-NANBH sera, as noted
above.

[0058] The 409-1-1(c-a) phage was identified by immunoscreening and plaque purified, as outlined above. A
related clone, designated 409-1-1(abc), was described in the parent to the present application (U. S. Application Ser.
No. 07/505,611, herein incorporated by reference). Clone 409-1-1(abc) was designated 409-1-1 in the parent
application. The a, b and c designations refer to three regions of the 409-1-1(abc) sequence (see Figure 5). The 5-1-1 coding
sequence was isolated by polymerase chain reaction using oligonucleotide primers complementary to the ends of the
5-1-1 coding region, and cloned into lambda gt11 for expression under induction conditions of a fused
beta-galactosidase protein which includes the 5-1-1 antigen peptide region. The 5-1-1 phage was identified and plaque purified by
similar methods.

[0059] The 409-1-1(c-a) and 5-1-1 antigens were compared by plaque immunoscreening with a panel of 28 sera
from normal (2 donors), human PT-NANBH-chronic (6 donors), chimpanzee normal (7 donors), chimpanzee
PT-NANBH-acute (5 donors), and chimpanzee PT-NANBH-chronic (8 donors), with the results shown in Table 5 in Example
12. As can be seen in Table 5, the 5-1-1 and 409-1-1(c-a) peptides reacted with most of the human and chimpanzee
chronic sera, although the 409-1-1(c-a) peptide detected a higher percentage of human chronic sera samples (83% vs
66%). The chronic human serum which was detected by the 409-1-1(c-a) peptide, but not by 5-1-1, was from a patient
(BV) who died of fulminant NANBH infection. Because the 5-1-1 antigen is contained within the C-100 antigen in the
commercially available kit format (Ortho/Chiron), it was of interest to determine whether the C-100 antigen gave a
broader range of reactivity with the test sera. The results are shown at the right in Table 5 below. The only human
NANBH serum that was tested against C-100 was the above BV serum, which was not detected by 5-1-1. This serum was
also not immunoreactive with the C-100 antigen (0/1). Nor was the C-100 antigen reactive with any of the five acute
chimp sera which were tested (0/5). It is also noted that the 409-1-1(c-a) antigen is immunoreactive with 3 of the 5 acute
chimpanzee sera tested, compared with only 1 out of 5 for the 5-1-1 antigen. The results indicate that the 409-1-1(c-a) antigen
has broader immunospecificity with PT-NANBH sera, and thus would provide a superior diagnostic agent. The results
obtained with 409-1-1(c-a) are comparable to the results obtained using 409-1-1(abc).
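
The detection rates quoted in this comparison can be checked with a short calculation, sketched here in Python for illustration. The acute chimpanzee counts (3 of 5 and 1 of 5) are stated in the text; the human chronic counts of 5 of 6 and 4 of 6 are an assumption inferred from the reported 83% and 66% for the six human chronic donors, and the names `panel` and `detection_rate` are hypothetical.

```python
# Counts are (reactive sera, sera tested). The 3/5 and 1/5 acute-chimpanzee
# figures are stated in the text; 5/6 and 4/6 are ASSUMED from the reported
# 83% and 66% for the six human chronic donors.
panel = {
    ("409-1-1(c-a)", "human chronic"): (5, 6),
    ("5-1-1", "human chronic"): (4, 6),
    ("409-1-1(c-a)", "chimpanzee acute"): (3, 5),
    ("5-1-1", "chimpanzee acute"): (1, 5),
}

def detection_rate(positive, tested):
    """Return the percentage of tested sera that reacted."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * positive / tested

rates = {key: detection_rate(*counts) for key, counts in panel.items()}
for (antigen, group), rate in rates.items():
    print(f"{antigen:14s} {group:18s} {rate:5.1f}%")
```

Under the assumed counts, 4 of 6 corresponds to 66.7%, which the text reports truncated as 66%.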

[0060] It is noted here that the 409-1-1(abc) coding sequence is contained in the F4/R5 linking fragment and does
not overlap the sequence of the C-100 (and 5-1-1) coding region which is in the F^g and F^Rg linking fragments. The 
relatively long coding sequence of the 409-1-1(abc) peptide illustrates that larger size digest fragments (substantially
greater than 300 basepairs) are generated in the partial digest step used in producing digest fragments for antigen
expression.

[0061] The 409-1-1(abc) peptide, which forms one aspect of the invention, has the amino acid sequence which is
presented as SEQ ID NO:10. The DNA coding sequence corresponding to the insert in the 409-1-1 clone is given in 
Figure 5 and is presented as SEQ ID NO:9. 

[0062] The 409-1-1(c-a) peptide, which forms another aspect of the invention, has the amino acid sequence
presented as SEQ ID NO:8. The DNA coding sequence corresponding to the insert in the 409-1-1(c-a) clone is given in
Figure 6 and is presented as SEQ ID NO:7. The relationship between the coding sequence of 409-1-1(c-a) and
409-1-1(abc) is outlined in Example 12. Briefly, 409-1-1(c-a) consists of a carboxy terminal region of 409-1-1(abc) moved to
the amino terminus of the 409-1-1 coding sequence, with a truncation of the remaining 3' 409-1-1(abc) coding
sequence.

[0063] More generally, the invention includes a peptide antigen which is immunoreactive with sera from humans
with HCV infection. Such peptide antigens are readily identifiable by the methods of the present invention. 
[0064] Antigens obtained from the region corresponding to the HCV sequences encoding the 409-1-1 antigens 
were further characterized as follows. The primers shown in Table 7 were used to generate a family of overlapping 
amplified fragments derived from this region. Several templates were used for the DNA amplification reactions (Table 
8). The relationships of the coding sequences of the resulting clones to each other are graphically illustrated in Figure 
7. The amplified fragments were then cloned into lambda gt11 vectors (Example 13).

[0065] These cloned fragments were then immunoscreened (Example 13). Seven of the nine clones tested positive
by preliminary immunoscreening (Table 9). These seven clones were then tested against a more extensive battery of
PT-NANBH serum samples, including numerous human clinical samples. The sensitivity of the antigens, in decreasing
order, for reactivity with the serum used for screening was as follows: 33cu > 33c > 409-1-1(c-a) > 409-1-1-F1R2 >
409-1-1(abc) ~ 409-1-1a > 5-1-1 > 409-1-1(c+270). As can be seen from these results, all of the alternative clones, with the
exception of 409-1-1(c+270), provided a more sensitive antigen than 5-1-1. However, although 33cu and 33c were very
sensitive antigens, in this assay they reacted slightly with serum which was known to be negative for HCV and may
therefore be less specific. Accordingly, the 409-1-1 series appears preferable for use as diagnostic antigens since they
are more specific to HCV-induced antibodies.

[0066] The immunoscreening was extended to include the clone 36 and 45 encoded epitopes: the insert of clone
45 is essentially the same as the insert of clone 40 (Example 4). As can be seen from the results presented in Table 11,
the antigens produced by clones 36 and 40, while not as sensitive as 409-1-1(c-a), do yield HCV-specific
immunopositive signals with selected samples. Accordingly, the two methods presented in the present invention, (i)
immunoscreening of cDNA libraries generated directly from sera-derived RNA, and (ii) immunoscreening of amplified-fragment
libraries, can both be seen to be effective methods of identifying cDNA sequences encoding viral antigens. Further,
confirmation of the clone 36 and 40 encoded antigens by identification of antigens corresponding to these HCV regions
using the amplified-fragment library method validates the usefulness of the amplified-fragment method.

B. Peptide Purification 

[0067] The recombinant peptides of the present invention can be purified by standard protein purification proce- 
dures which may include differential precipitation, molecular sieve chromatography, ion-exchange chromatography, iso- 
electric focusing, gel electrophoresis and affinity chromatography. In the case of a fused protein, such as the beta- 
galactosidase fused proteins prepared as above, the fused protein can be isolated readily by affinity chromatography, 
by passing cell lysis material over a solid support having surface-bound anti-beta-galactosidase antibody. For example, 
purification of a beta-galactosidase/fusion protein, derived from 409-1-1(c-a) coding sequences, by affinity
chromatography is described in Example 14.

[0068] A fused protein containing the 409-1-1(a) peptide fused with glutathione-S-transferase (Sj26) protein has
also been expressed using the pGEX vector system in E. coli KM392 cells (Smith). This expression system has the 
advantage that the fused protein is generally soluble and therefore can be isolated under non-denaturing conditions. 
The fused Sj26 protein can be isolated readily by glutathione substrate affinity chromatography (Smith). This method of 
expressing this fusion protein is given in Example 15 and is applicable to any of the other antigen coding sequences 
described by the present invention. 

[0069] Also included in the invention is an expression vector, such as the lambda gt11 or pGEX vectors described
above, containing the 409-1-1(a) coding sequence and expression control elements which allow expression of the
coding region in a suitable host. The coding sequence is contained in the sequence given above corresponding to
basepairs 2755-3331 of the HCV genome. The control elements generally include a promoter, translation initiation codon, and
translation and transcription termination sequences, and an insertion site for introducing the insert into the vector. In the
case of the two vectors illustrated in Example 15, the control elements control the synthesis of the protein which is fused
with the heterologous peptide antigen. Such expression vectors can be readily constructed for the other antigen coding
sequences described by the present invention.
[0070] The lambda gt11 vectors containing the following coding regions have been deposited with The American
Type Culture Collection, 12301 Parklawn Dr., Rockville MD, 20852: the 409-1-1(abc) coding region, designated
gt11/409-1-1(abc), ATCC No. 40876; the 409-1-1(c-a) coding region, designated gt11/409-1-1(c-a), ATCC No. 40792; clone
36, designated gt11/36, ATCC No. 40901; and clone 40, designated gt11/40, ATCC No. 40893.

VII. Immunoreactive Clones of the HCV-Capsid Antigen

[0071] At the 1990 Congress of Hepatology a region of the full-length HCV nucleic acid sequence was presented,
nucleotide residues 325-970, containing the HCV non-coding, structural core protein and envelope protein coding
sequences as capsid parts of a polyprotein sequence. During the course of experiments performed in support of the
present invention, the coding region that corresponds to the capsid protein was more clearly defined.
[0072] Polymerase Chain Reaction primers were constructed from selected HCV sequence which would generate
amplification products of nucleotides 325-970 of the full length HCV genome (see Appendix). These primers, SF2(C)
and SR1(C), are presented in Example 16. The primers contained non-complementary sequences which encoded
restriction enzyme cleavage sites to facilitate subsequent cloning manipulations. The primers were used in amplification
reactions containing SISPA-amplified HCV cDNA molecules (Example 7) as substrate. The resulting amplification
products were cloned into the pGEX and pET vectors (Example 16). The pGEX vector allows expression of inserted coding
sequences as fusion proteins to the Sj26 protein, glutathione-S-transferase. Insertion into the pET vector allows
expression of the inserted coding sequences independent of fusion sequences.

[0073] These clones were then immunologically screened using sera known to be reactive with HCV antigens
(Example 17). Several clones in both vectors were identified which were immunoreactive with the anti-HCV sera (in 
pGEX, clones 14, 15, 56, 60, and 65, Example 17, Table 13). It was observed that the fusion proteins which were pro- 
duced from the clones in pGEX were smaller than expected. 

[0074] Clone 15 was selected for scaled up production of the Sj26/HCV-antigen fusion protein. The fusion protein
product (approximately 29 kd) was smaller than the expected fusion product (approximately 50 kd, Example 17).
Further, the yield of the fusion protein from this preparation was unexpectedly low.

[0075] Clones 15 and 56 were chosen for nucleic acid sequencing of the HCV-antigen containing inserts (Example
18). The sequences of the two clones were very similar with the exception that clone 15 had a termination codon
starting at nucleotide position 126. This result suggested that the amino terminal 42 amino acids encoded by the HCV insert
were immunogenic in regard to the anti-HCV sera used for immunoscreening.
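
The relationship between the position of the termination codon and the length of the truncated product (126 nucleotides corresponding to 42 codons) can be illustrated with a brief Python sketch. The sequence below is synthetic and is not the clone 15 insert, the function name `codons_before_stop` is hypothetical, and it is assumed for illustration that 126 nucleotides precede the termination codon in the reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons_before_stop(orf):
    """Count in-frame codons that precede the first stop codon.

    Returns the total number of complete codons if no stop is found.
    """
    orf = orf.upper()
    count = 0
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return count

# Synthetic insert (NOT the clone 15 sequence): 42 sense codons, a stop
# codon beginning after 126 nucleotides, then further sense codons.
synthetic_insert = "GCT" * 42 + "TGA" + "GCT" * 10
n = codons_before_stop(synthetic_insert)
print(n)
```

A truncated product of this kind retains only the amino-terminal residues, which is why immunoreactivity of the clone 15 product points to an epitope near the amino terminus.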
[0076] To test the suggestion that the amino terminus of the HCV polyprotein was antigenic, a synthetic
oligopeptide was constructed essentially corresponding to amino acid residues 6-24 of Figure 8A: this peptide had very strong
immunoreactivity with anti-HCV sera as tested by ELISA. PCR primers (Figure 8, C1 and NC105) were designed to
generate a clone corresponding to this region (Figure 10, C1NC105, SEQ ID NO:25). Three other synthetic peptides
were tested, one of which was strongly immunoreactive with anti-HCV sera (amino acid residues 47-74, Figure 8A) and
two of which were weakly immunoreactive (amino acid residues 39-60 and 101-121, Figure 8A). These synthetic peptides
confirm the presence of a strong antigenic region at the amino-terminal end of the HCV polyprotein in the capsid protein
region.

[0077] The sequence of clone 56, designated pGEX-GG1-56, is shown in Figure 8A and is presented in the
sequence listing as SEQ ID NO:11. The sequence shows that the clone has a long, open reading frame. When
production of the fusion protein was induced, a fusion protein smaller than the expected product was produced, similar in size
to the clone 15 product. The nucleotide sequence of the clones revealed a region which is prone to translational
frameshifting, AAAAAAAAAA (Atkins et al., Wilson et al.). Such a nucleotide sequence may contribute to the low protein
yields when these clones are expressed in E. coli. In an effort to improve the level of fusion protein expression, the third
nucleotide position of several codons through this region was changed to a G, resulting in the sequence AGAAGAAGAA
(Example 20): the changes had no effect on the protein coding sequence (amino acid residues 8-10, Figure 8A). This
modified insert was cloned into the pGEX vector and the resulting plasmid named pGEX-CapA.
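
The following Python sketch illustrates how third-position changes of this kind can leave the encoded peptide unchanged. Only the two ten-nucleotide runs are taken from the text. The flanking "C" bases, and therefore the reading frame and the resulting codons, are hypothetical and were chosen solely so that the changed positions fall on third codon positions; the actual context is that of Figure 8A.

```python
# Minimal codon table covering only the codons used in this sketch.
CODON_TABLE = {
    "CAA": "Q", "CAG": "Q",   # glutamine
    "AAA": "K", "AAG": "K",   # lysine
    "AAC": "N",               # asparagine
}

def translate(dna):
    """Translate an in-frame DNA string using the minimal table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

original_run = "AAAAAAAAAA"   # run reported as prone to frameshifting
modified_run = "AGAAGAAGAA"   # run after the third-position changes

# HYPOTHETICAL flanking bases ("C" on each side), assumed only to place
# the changed positions on third codon positions.
original = "C" + original_run + "C"
modified = "C" + modified_run + "C"

changed = [i for i, (a, b) in enumerate(zip(original, modified)) if a != b]
print(changed)                                   # 0-based changed positions
print(translate(original), translate(modified))  # encoded peptides
```

In this assumed frame every changed base is the third base of its codon and each change is synonymous, so the run of identical nucleotides is interrupted without altering the peptide.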
[0078] A hydropathicity plot was generated for the protein coding sequences of the insert of pGEX-GG1 (Example
19, Figure 9). The results of this analysis indicated that the carboxy-terminal region of the encoded protein,
approximately amino acid residues 168-182, had the potential for being a membrane spanning segment. Since it was unlikely
that the membrane spanning segment would provide a strong antigen and since overproduction of proteins with these
regions can adversely affect the growth of bacterial cells, a series of carboxy terminal deletions was generated from
pGEX-CapA (Example 20).
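
A hydropathicity profile of the general kind referred to above can be sketched as a sliding-window average, shown here in Python for illustration. The Kyte-Doolittle scale and the 15-residue window are assumptions, since the scale and window used for Figure 9 are not stated here; the test sequence is synthetic and is not the pGEX-GG1 insert, and the function name `hydropathy_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (ASSUMED scale; the scale used for
# Figure 9 is not stated in this section).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(protein, window=15):
    """Mean hydropathy of each full window along the protein."""
    values = [KD[aa] for aa in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Synthetic sequence (NOT the pGEX-GG1 insert): a charged stretch
# followed by a hydrophobic stretch.
synthetic = "KRDEKRDEKRDEKRD" + "LIVALIVALIVALIV"
profile = hydropathy_profile(synthetic, window=15)
peak_start = max(range(len(profile)), key=profile.__getitem__)
print(peak_start + 1, round(profile[peak_start], 2))
```

Windows with a strongly positive mean are candidates for membrane spanning segments, which is the type of feature the analysis above associates with residues 168-182.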

[0079] To generate the carboxy terminal deletions, PCR primers were designed to be complementary to various
regions of the pGEX-CapA insert encoded protein. The primers used to generate the carboxy terminal deletions are
given in Table 14 and the location of the primers relative to the insert coding sequence is presented in Figure 8B. The
carboxy terminal deletion fragments were cloned into the pGEX vector and Sj26/HCV-insert fusion proteins were
produced. These fusion proteins were then screened with anti-HCV sera and an epitope map generated for the
immunoreactive polypeptides (see Figure 10). Clones C1NC270, C1NC360, and C1NC450 all expressed high levels of the
Sj26/HCV fusion proteins. Further, these fusion proteins all corresponded to the size predicted from their nucleic acid
coding sequences. Clones C1NC520 and C1NC580 gave poor yields of fusion proteins, suggesting that when the
hydrophobic region of amino acid residues 168-182 is present it may in part be responsible for the poor protein yields
previously obtained.

[0080] The deletion analysis was continued to further dissect the antigenic regions of the pGEX-CapA encoded 
HCV antigen. A series of amino terminal deletions (primers in Table 15) combined with carboxy terminal deletions were
generated using PCR primers: the locations of all the primers are illustrated in Figure 8B. 

[0081] The results of the deletion analysis are presented in Table 16 and in Figure 10. These results, combined with 
the synthetic peptide data presented above, suggest that the capsid protein (which comprises the N-terminus of the 
HCV polyprotein) has two dominant immunoreactive regions. Both of these immunoreactive regions are useful as
diagnostic antigens. The region comprising the first 35 amino acids spans one of the epitopes and the region spanning 
residues 34-90 encompasses the other strongly immunoreactive domain. 

[0082] In summary, all of the pGEX clones containing the N-terminus of the HCV polyprotein and either 34, 90, 120
or 150 residues produced large quantities of fusion protein which was shown to be efficiently recognized by HCV-positive
sera. The expression products of the PCR inserts containing amino acid residues 34-90 were also strongly immunoreactive,
whereas inserts encoding residues 90-120 or 90-150 were not immunoreactive, demonstrating that these regions were
not recognized by human sera. This result suggests that the regions important for the production of recombinant
antigens are contained between residues 1 through 90.

[0083] Analyses of the pGEX-C1NC450 protein and the pET360 protein showed that the inclusion of these antigens
in Western and ELISA formats permitted the identification of HCV positive sera which had been previously identified as
either HCV negative or HCV indeterminate. Accordingly, the inclusion of these epitopes permits the generation of an
improved screening system (Example 21).

VIII. Anti-HCV Antigen Antibodies 

[0084] In another aspect, the invention includes antibodies specific against the recombinant antigens of the present 
invention. Typically, to prepare antibodies, a host animal, such as a rabbit, is immunized with the purified antigen or 
fused protein antigen. The host serum or plasma is collected following an appropriate time interval, and this serum is 
tested for antibodies specific against the antigen. Example 15 describes the production of rabbit serum antibodies 
which are specific against the 409-1-1 antigens in the Sj26/409-1-1(a) and beta-galactosidase/409-1-1(c-a) fusion pro- 
tein. These techniques are equally applicable to the other antigens of the present invention. 

[0085] The gamma globulin fraction or the IgG antibodies of immunized animals can be obtained, for example, by 
use of saturated ammonium sulfate or DEAE Sephadex, or other techniques known to those skilled in the art for pro- 
ducing polyclonal antibodies. 

[0086] Alternatively, the purified antigen or fused antigen protein may be used for producing monoclonal antibodies. 
Here the spleen or lymphocytes from an immunized animal are removed and immortalized or used to prepare hybrido- 
mas by methods known to those skilled in the art. To produce a human-human hybridoma, a human lymphocyte donor 
is selected. A donor known to be infected with an HCV virus (where infection has been shown for example by the pres- 
ence of anti-virus antibodies in the blood) may serve as a suitable lymphocyte donor. Lymphocytes can be isolated from 
a peripheral blood sample or spleen cells may be used if the donor is subject to splenectomy. Epstein-Barr virus (EBV) 
can be used to immortalize human lymphocytes or a human fusion partner can be used to produce human-human 
hybridomas. Primary in vitro immunization with peptides can also be used in the generation of human monoclonal anti- 
bodies. 

[0087] Antibodies secreted by the immortalized cells are screened to determine the clones that secrete antibodies 
of the desired specificity, for example, using the Western blot method described in Example 15.

IX. Utility

A. Diagnostic Method and Kit 

[0088] The antigens obtained by the methods of the present invention are advantageous for use as diagnostic
agents for anti-HCV antibodies present in HCV-infected sera; particularly, the 409-1-1 antigens (409-1-1(abc),
409-1-1(c-a), and related antigens (see Table 9)); the clone 36 antigen; the clone 40 antigen; and the capsid antigen. As
noted above, many of the antigens provide the advantage over known HCV antigen reagents 5-1-1 and C-100 in that
they are immunoreactive with a wider range of PT-NANBH infected sera, particularly acute-infection sera. This is
particularly true of combinations of the 409-1-1 antigens with the HCV-core protein antigens as described in Section VII
above. The antigens 409-1-1(c-a) and Cap450 have been combined in an ELISA test kit and tested against HCV test
kits produced by Abbott and Ortho. The antigens of the present invention consistently identify more HCV+ samples with
a high degree of specificity which is comparable to or better than the Abbott and Ortho test kits.
[0089] In one preferred diagnostic configuration, test serum is reacted with a solid phase reagent having a
surface-bound HCV antigen (or antigens) obtained by the methods of the present invention, e.g., the 409-1-1(c-a) antigen and
the Cap450 antigen. After binding anti-HCV antibody to the reagent and removing unbound serum components by
washing, the reagent is reacted with reporter-labeled anti-human antibody to bind reporter to the reagent in proportion
to the amount of bound anti-PT-NANBH antibody on the solid support. The reagent is again washed to remove unbound
labeled antibody, and the amount of reporter associated with the reagent is determined. Typically, the reporter is an
enzyme which is detected by incubating the solid phase in the presence of a suitable fluorometric or colorimetric
substrate.

[0090] The solid surface reagent in the above assay is prepared by known techniques for attaching protein material 
to solid support material, such as polymeric beads, dip sticks, 96-well plate or filter material. These attachment methods 
generally include non-specific adsorption of the protein to the support or covalent attachment of the protein, typically 
through a free amine group, to a chemically reactive group on the solid support, such as an activated carboxyl, hydroxyl, 
or aldehyde group.

[0091] In a second diagnostic configuration, known as a homogeneous assay, antibody binding to a solid support
produces some change in the reaction medium which can be directly detected in the medium. Known general types of
homogeneous assays proposed heretofore include (a) spin-labeled reporters, where antibody binding to the antigen is
detected by a change in reporter mobility (broadening of the spin splitting peaks), (b) fluorescent reporters, where
binding is detected by a change in fluorescence efficiency, (c) enzyme reporters, where antibody binding affects
enzyme/substrate interactions, and (d) liposome-bound reporters, where binding leads to liposome lysis and release of
encapsulated reporter. The adaptation of these methods to the protein antigens of the present invention follows
conventional methods for preparing homogeneous assay reagents.

[0092] In each of the assays described above, the assay method involves reacting the serum from a test individual
with the protein antigen and examining the antigen for the presence of bound antibody. The examination may involve
attaching a labeled anti-human antibody to the antibody being examined, either IgM (acute phase) or IgG (convalescent
or chronic phase), and measuring the amount of reporter bound to the solid support, as in the first method, or may
involve observing the effect of antibody binding on a homogeneous assay reagent, as in the second method.
[0093] Also forming part of the invention is an assay system or kit for carrying out the assay method just described.
The kit generally includes a support with surface-bound recombinant HCV antigen (e.g., the 409-1-1 antigens, etc., as
above), and a reporter-labeled anti-human antibody for detecting surface-bound anti-PT-NANBH-antigen antibody.
[0094] As discussed in Section III above, peptide antigens associated with several of the linking-fragment libraries
are immunoreactive with acute NANBH sera from chimpanzees, indicating that the peptides would be useful for
detecting acute NANBH infection in human serum. In particular, one or more peptide antigens produced by the linking
fragment libraries F 8 /R 9 (reactive with chronic sera), F 3 /R 4 , F 6 /R 12 , F 12 /R 7 , or F 7 /R 8 (which are shown in Example 11
to produce one or more peptide antigens which are immunoreactive with acute chimpanzee sera) can be combined with
the 409-1-1 antigens to provide a diagnostic composition capable of immunoreacting with a high percentage of both
chronic and acute human NANBH serum samples. Further, as discussed in Section VII above, inclusion of the
HCV-capsid protein antigens of the present invention adds an extra level of sensitivity.

[0095] A third diagnostic configuration involves use of the anti-HCV antibodies, described in Section VI above,
capable of detecting HCV specific antigens. The HCV antigens may be detected, for example, using an antigen capture
assay where HCV antigens present in candidate serum samples are reacted with an HCV specific monoclonal antibody.
The monoclonal antibody is bound to a solid substrate and the antigen is then detected by a second, different labelled
anti-HCV antibody: the monoclonal antibodies of the present invention which are directed against HCV specific
antigens are particularly suited to this diagnostic method.

B. Peptide Vaccine 

[0096] The HCV antigens identified by the methods of the present invention, e.g. 409-1-1(c-a) and HCV-core protein
antigens, can be formulated for use in an HCV vaccine. The vaccine can be formulated by standard methods, for
example, in a suitable diluent such as water, saline, buffered salines, complete or incomplete adjuvants, and the like.
The immunogen is administered using standard techniques for antibody induction, such as by subcutaneous administration
of physiologically compatible, sterile solutions containing inactivated or attenuated virus particles or antigens. An
immune response producing amount of virus particles is typically administered per vaccinizing injection, typically in a
volume of one milliliter or less.

[0097] A specific example of a vaccine composition includes, in a pharmacologically acceptable adjuvant, a recombinant
409-1-1(c-a) peptide. The vaccine is administered at periodic intervals until a significant titer of anti-HCV antibody
is detected in the serum. Such vaccines can also comprise combinations of the HCV antigens of the present
invention.

C. Passive Immunoprophylaxis 

[0098] The anti-HCV antibodies of the invention can be used as a means of enhancing an anti-HCV immune 
response since antibody-virus complexes are recognized by macrophages and other effector cells. The antibodies can 
be administered in amounts similar to those used for other therapeutic administrations of antibody. For example, pooled 
gamma globulin is administered at 0.02-0.1 ml/lb body weight during the early incubation of other viral diseases such 
as rabies, measles and hepatitis B to interfere with viral entry into cells. Thus, antibodies reactive with, for example, the
409-1-1(c-a) antigen can be passively administered alone, in a "cocktail" with other anti-viral antibodies, or in conjunction
with another anti-viral agent to a host infected with a PT-NANBH virus to enhance the immune response and/or the
effectiveness of an antiviral drug.

[0099] The following examples illustrate various aspects of the invention, but are in no way intended to limit the 
scope thereof. 

Materials 

[0100] E. coli DNA polymerase I (Klenow fragment) was obtained from Boehringer Mannheim Biochemicals (Indianapolis,
IN). T4 DNA ligase and T4 DNA polymerase were obtained from New England Biolabs (Beverly, MA). Nitrocellulose
filters were obtained from Schleicher and Schuell (Keene, NH).

[0101] Synthetic oligonucleotide linkers and primers were prepared using commercially available automated oligo- 
nucleotide synthesizers. Alternatively, custom designed synthetic oligonucleotides may be purchased, for example, 
from Synthetic Genetics (San Diego, CA). cDNA synthesis kit and random priming labeling kits were obtained from 
Boehringer-Mannheim Biochemical (BMB, Indianapolis, IN). 

Example 1 

Construction of NANB-containing cDNA libraries

A. Infection of a Chimpanzee with HCV 

[0102] A chimpanzee (#771) was inoculated with a Factor VIII preparation which was known to cause parenterally 
transmitted non-A non-B hepatitis (PT-NANBH) in human patients treated with the Factor VIII concentrate (Bradley). 
Post-infection ultrastructural changes in liver tissue were observed by electron microscopy and ALT (alanine amino 
transferase) elevation was observed in the infected chimpanzee. These observations are consistent with PT-NANBH 
infection. 

B. Isolation of RNA from Sera 

[0103] Serum was collected from the above described infected chimpanzee (#771) and four human PT-NANBH
clinical sources (EGM, BV, CC and WEH). Ten milliliters of each undiluted serum was pelleted by centrifugation at 30K
for 3 hours in an SW40 rotor, at 4°C. RNA was extracted from each resulting serum pellet using the following modifications
of the hot phenol method of Feramisco et al. Briefly, for each individual serum sample, the pellet was resuspended
in 0.5 ml of 50 mM NaOAc, pH=4.8, containing 1% SDS. An equal volume of 60°C phenol was added and incubated for
15 minutes at 60°C with occasional vortexing. This mixture was transferred to a 1.5 ml microfuge tube and spun for two
minutes at room temperature in a table top microfuge. The aqueous phase was transferred to a new microfuge tube. To
the aqueous phase, 50 µl of 3 M NaOAc, pH=5.2, and two volumes of 100% ethanol were added. This solution was held
at -70°C for approximately 10 minutes and then spun in a microfuge at 4°C for 10 minutes. The resulting pellet was
resuspended in 100 µl of sterile glass distilled water. To this solution 10 µl of NaOAc, pH=5.2, and two volumes of 100%
ethanol were added. The solution was held at -70°C for at least 10 minutes. The RNA pellet was recovered by centrifugation
in a microfuge at 12,000 X g for 15 minutes at 5°C. The pellet was washed in 70% ethanol and dried under
vacuum.

C. Synthesis of cDNA 

(i) First Strand Synthesis 

[0104] The synthesis of cDNA molecules was accomplished as follows. The above described RNA preparations
were each resuspended in 26 µl of sterile glass distilled water (treated with diethyl pyrocarbonate, Maniatis et al.), 5 µl
of 10 X reaction buffer (0.5 M Tris HCl, pH=8.5; 0.4 M KCl; 0.1 M MgCl2; 4 mM DTT), 10 µl of a nucleotide solution
(dGTP, dATP, dTTP, and dCTP, each at a concentration of 5 mM), 5 µl random primer, 0.25 µl of 32P-dCTP, 2 µl AMV
reverse transcriptase, and 2 µl of RNASIN (Promega), in a total reaction volume of 50 µl. This mixture was incubated
for one hour at 42°C.
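
The component volumes in the first-strand reaction can be tallied to check the stated 50 µl total and to derive the working concentrations. The following is only an illustrative sketch: the dictionary keys and the helper `final_conc` are names introduced here, and the calculation assumes simple dilution into the nominal 50 µl volume.

```python
# Volumes (µl) as listed in the first-strand synthesis paragraph.
components_ul = {
    "water (RNA resuspended)": 26.0,
    "10X reaction buffer": 5.0,
    "dNTP solution (5 mM each)": 10.0,
    "random primer": 5.0,
    "32P-dCTP": 0.25,
    "AMV reverse transcriptase": 2.0,
    "RNASIN": 2.0,
}
NOMINAL_TOTAL_UL = 50.0  # total reaction volume stated in the text

# Stock concentrations of the 10X reaction buffer, in mM.
buffer_10x_mM = {"Tris-HCl": 500.0, "KCl": 400.0, "MgCl2": 100.0, "DTT": 4.0}


def final_conc(stock, volume_ul, total_ul=NOMINAL_TOTAL_UL):
    """Concentration after diluting `volume_ul` of stock into `total_ul`."""
    return stock * volume_ul / total_ul


summed_total_ul = sum(components_ul.values())
final_buffer_mM = {
    name: final_conc(stock, components_ul["10X reaction buffer"])
    for name, stock in buffer_10x_mM.items()
}
final_dntp_mM = final_conc(5.0, components_ul["dNTP solution (5 mM each)"])

print(f"summed volume: {summed_total_ul} µl (nominal {NOMINAL_TOTAL_UL} µl)")
for name, conc in final_buffer_mM.items():
    print(f"{name}: {conc:g} mM")
print(f"each dNTP: {final_dntp_mM:g} mM")
```

The listed volumes sum to slightly more than the nominal total because of the small radiolabel addition; the difference is negligible for the concentration estimates.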

(ii) Second Strand cDNA Synthesis 

[0105] To the first strand synthesis reaction mixture the following components were added: 55 µl of 2 X second
strand synthesis buffer (50 mM Tris HCl, pH=7.0; 60 mM KCl); 2 µl RNase H; 5 µl DNA polymerase I, and 2 µl of the
above described nucleotide solution. The reaction was incubated for one hour at 12°C, followed by a one hour incubation
at room temperature. The reaction mixture was extracted with an equal volume of 1:1 phenol/chloroform, followed
by an extraction using 24:1 chloroform/isoamyl alcohol. To each reaction mixture 1 µl of 10 mg/ml tRNA was added as
carrier. The cDNA was precipitated by the addition of two volumes of 100% ethanol and chilling at -70°C for 15 minutes.
The cDNA was collected by centrifugation, the pellet washed with 70% ethanol and dried under vacuum.

(iii) Preparation of the Double Stranded cDNA for cloning 

[0106] To provide vector compatible ends each of the double stranded cDNA preparations was tailed with EcoRI
linkers in the following manner.

[0107] The cDNA was treated with EcoRI methylase under the following conditions: The cDNA pellet was resuspended
in 20 µl 1x methylase buffer (50 mM Tris HCl, pH=7.5; 1 mM EDTA; 5 mM DTT), 2 µl 0.1 mM S-adenosyl-methionine
(SAM) and 2 µl EcoRI methylase (New England Biolabs). The reaction was incubated for 30 minutes at
37°C. TE buffer (10 mM Tris-HCl, pH=7.5; 1 mM EDTA, pH=8.0) was added to achieve a final volume of 80 µl. The reaction
mixture was extracted with an equal volume of phenol/chloroform (1:1) and then with an equal volume of
chloroform/isoamyl alcohol (24:1). The cDNA was precipitated with two volumes of ethanol.

[0108] To maximize the number of blunt ends for the addition of linkers (Maniatis et al., 1982) the cDNA was then
treated with the Klenow fragment of DNA polymerase I. The pelleted cDNA was resuspended in 11.5 µl of distilled
water. The following components were added to the resuspended cDNA: 4 µl of 5 X NTB (10 X NTB stock solution: 0.5
M Tris.Cl pH=7.2; 0.1 M MgSO4; 1 mM dithiothreitol (DTT); 500 µg/ml bovine serum albumin (BSA)); 3 µl 0.1 M MgCl2,
1.5 µl 10GATC (a solution containing 10 mM of each nucleotide G, A, T, and C), and 1 µl Klenow (Boehringer Mannheim
Biochemicals). The reaction mixture was incubated at room temperature for 30 minutes. The reaction mixture was
extracted with phenol/chloroform and chloroform/isoamyl alcohol as described above, and then precipitated with two
volumes of ethanol.

[0109] The cDNA pellet was resuspended in 12 µl distilled water. To the resuspended cDNA the following components
were added: 5 µl EcoRI phosphorylated linkers (New England Biolabs), 2 µl 10x ligation buffer (0.66 M Tris.Cl
pH=7.6, 50 mM MgCl2, 50 mM DTT, 10 mM ATP) and 1 µl T4 DNA ligase. The reaction was incubated at 14°C overnight.
The following morning the reaction was incubated at 67°C for three minutes to inactivate the ligase, then momentarily
chilled. To the ligation reaction mixture 2.5 µl of 10 X high salt restriction digest buffer (Maniatis et al.) and 2.5 µl
of EcoRI enzyme were added and the mixture incubated at 37°C for at least 6 hours to overnight. To remove excess
linkers the digestion mixture was loaded onto a 1.2% agarose gel and the reaction components size fractionated by
electrophoresis. Size fractions of the 0.3-1.3 Kb and 1.3-7 Kb ranges were electroeluted onto NA45 paper (Schleicher
and Schuell). The NA45 paper, with the eluted cDNA bound to it, was placed in a 1.5 ml microfuge tube containing 0.5
ml of elution solution (50 mM arginine, 1 M NaCl, pH=9.0). The tube was then placed at 67°C for approximately one
hour to allow the cDNA to be eluted from the paper into the solution. The solution was then phenol/chloroform,
chloroform/isoamyl alcohol extracted and precipitated with two volumes of ethanol. The resulting cDNA pellets were
resuspended in 20 µl TE (pH=7.5).

(iv) Cloning of the cDNA into Lambda Vectors 

[0110] The linkers used in the construction of the cDNAs contained an EcoRI site which allowed for direct insertion
of the amplified cDNAs into lambda gt10 and gt11 vectors (Promega, Madison, WI). Lambda vectors were purchased
from the manufacturer (Promega) which were already digested with EcoRI and treated with bacterial alkaline phosphatase,
to remove the 5' phosphate and prevent self-ligation of the vector.
[0111] The EcoRI-linkered cDNA preparations were ligated into both lambda gt10 and gt11 (Promega). The conditions
of the ligation reactions were as follows: 1 µl vector DNA (Promega, 0.5 mg/ml); 0.5 or 3 µl of insert cDNA; 0.5 µl
10 X ligation buffer (0.5 M Tris-HCl, pH=7.8; 0.1 M MgCl2; 0.2 M DTT; 10 mM ATP; 0.5 g/ml BSA), 0.5 µl T4 DNA ligase
(New England Biolabs) and distilled water to a final reaction volume of 5 µl.
[0112] The ligation reaction tubes were placed at 14°C overnight (12-18 hours). The ligated cDNA was packaged
the following morning by standard procedures using a lambda DNA packaging system (GIGAPAK, Stratagene, La Jolla,
CA), and then plated at various dilutions to determine the titer and recombinant frequency of the libraries. A standard
X-gal blue/white assay was used to screen the lambda gt11 libraries (Miller; Maniatis et al.). E. coli HG415 (from
Howard Gersenfeld, Dept. of Pathology, Stanford School of Medicine) plating bacteria, which allows only plaque formation
by recombinant clones, was used for plating the lambda gt10 libraries. The standard strain, E. coli C600hF-, may
be used as an alternative to E. coli HG415.

Example 2 

Screening the cDNA library for production of PT-NANBH antigens

[0113] The five lambda gt11 libraries generated in Example 1 were screened for specific HCV encoded viral antigens
by immunoscreening. The phage were plated for plaque formation using the Escherichia coli bacterial plating
strain E. coli KM392 (Kevin Moore, DNAX, Palo Alto, CA). Alternatively, E. coli Y1088 may be used. The fusion proteins
expressed by the lambda gt11 clones were screened with serum antibodies (Young et al.) from the following sources:
chimpanzee #771 and various human PT-NANBH sera (including EGM, BV, WEH and AG).

[0114] From the lambda gt11 libraries (Example 1) approximately 111 independent clones gave a positive immunological
reaction with at least one of the chimp or human PT-NANBH sera. These phage clones were plaque purified and
the recombinant phage grown for DNA purification (Maniatis et al.).

Example 3 

Genomic Hybridization Screening of Immunopositive Clones 

[0115] Out of the 111 plaque purified recombinant phage, obtained as in Example 2, 93 were isolated (Maniatis et
al.) and digested with EcoRI as per the manufacturer's instructions (Bethesda Research Laboratories, Gaithersburg,
MD). Approximately 1.0 microgram of each digested phage DNA sample was loaded into sample wells of 1.0% agarose
gels prepared using TAE (0.04 M Tris acetate, 0.001 M EDTA). The DNA samples were then electrophoretically separated.
DNA bands were visualized by ethidium bromide staining (Maniatis et al.). Inserts were clearly identified for each
of the 93 clones, purified by electroelution using NA45, and then radioactively labelled by nick translation (Maniatis et
al.).

[0116] Human peripheral blood lymphocyte (PBL) DNA was restriction digested with HindIII and EcoRI, loaded on
a 0.7% agarose gel (as above, except 10 µg of DNA was loaded per lane) and the fragments separated electrophoretically.
The DNA fragments in the agarose gels were transferred to nitrocellulose filters (Southern) and the genomic DNA
probed with the nick-translated lambda gt11 inserts which were prepared above.

[0117] The filters were washed (Southern; Maniatis et al.) and exposed to X-ray film. Forty-three of the 93 lambda
clone inserts displayed a positive hybridization reaction with the human PBL DNA. Among the remaining inserts which
clearly did not hybridize with the PBL DNA were 11 inserts derived from chimp #771 clones which were also clearly
immunopositive from Example 2. Of these 11 clones, two of the clones had the immunoreactive characteristics summarized
in Table 1. Chimpanzee #771 and humans AG, BV and WEH were chronic PT-NANBH sera samples and SKF
was a normal human serum sample.



Table 1

Sera      Clone Designation
          36        40
#771      +         +
AG        +
BV        +
WEH
SKF

Clone 40 (original clone screening designation 304-12-1) was clearly exogenous, i.e., not derived from normal human
DNA, as evidenced by repeated hybridization tests against normal human PBL DNA, and a second clone, designated
clone 36 (original clone screening designation 303-1-4), was not only exogenous but also reactive with multiple
PT-NANBH antisera.

Example 4

Sequencing of Clones 

[0118] DNA sequencing was performed on clones 36 and 40 as described in Example 3. Commercially available
sequencing primers (New England Biolabs) homologous to flanking lambda sequences at the 5' and 3' ends of the
inserts were initially used for sequencing. As sequencing progressed primers were constructed to correspond to newly
discovered sequences. Synthetic oligonucleotide primers were prepared using commercially available automated
oligonucleotide synthesizers. Alternatively, custom designed synthetic oligonucleotides may be purchased, for example,
from Synthetic Genetics (San Diego, CA).
[0119] DNA sequences were determined for the complete insert of clone 40 (presented as SEQ ID NO:1 and also
shown in Figure 3); this sequence corresponds to nucleotides 6516 to 7070 of the HCV genome (Appendix). Subsequently,
the inserts present in clones 44 and 45 (2 other clones of the 11 clones identified in Example 3) were found to
cross-hybridize to the clone 40 insert. Partial sequencing of clones 44 and 45 showed that the sequences obtained from
these two clones matched the sequence of clone 40. A partial sequence of the clone 36 insert was determined and is
presented as SEQ ID NO:3; the complete sequence is presented as SEQ ID NO:5 and is also shown in Figure 4. The
sequence of clone 36 corresponds to nucleotides 5010 to 6515 given in the Appendix.
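
The genome coordinates quoted for the two inserts can be used to work out insert lengths and to see how the clones sit relative to each other. The sketch below is illustrative only; the `clones` dictionary and the helper `span_length` are names introduced here, and coordinates are treated as 1-based and inclusive.

```python
# Genome coordinates (1-based, inclusive) quoted in the text for each insert.
clones = {
    "clone 36": (5010, 6515),
    "clone 40": (6516, 7070),
}


def span_length(start, end):
    """Number of nucleotides in an inclusive coordinate range."""
    return end - start + 1


lengths = {name: span_length(*coords) for name, coords in clones.items()}

# The inserts are adjacent if clone 40 starts one base after clone 36 ends.
adjacent = clones["clone 36"][1] + 1 == clones["clone 40"][0]
combined = span_length(clones["clone 36"][0], clones["clone 40"][1])

for name, n in lengths.items():
    print(f"{name}: {n} nt")
print(f"adjacent on the genome: {adjacent}; combined span: {combined} nt")
```

On these coordinates the two inserts abut without overlap, and the clone 40 length is in line with the "approximately 500 base pairs" quoted for that insert in Example 5.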

Example 5 

Screening of the cDNA library in lambda gt10

[0120] The cDNA libraries in lambda gt10, generated in Example 1, were screened for the presence of sequences
homologous to the clone 40 insert.
[0121] The lambda gt10 libraries were plated at a density of approximately 10^4 plaques/plate and plaque lifts were
prepared according to Maniatis et al. Filters were indexed using india ink to allow alignment of the filters with the parent
plate from which the plaque lift was performed. The bacteria and phage particles were lysed, and the nitrocellulose filters
were processed and baked as previously described (Maniatis et al.). The prehybridization solution, per filter, consisted
of the following: 5.4 ml prehybridization buffer (50 ml of 1 M Tris HCl, pH=7.5; 2 ml of 0.5 M EDTA, pH=8.0; 50 ml
of 10% SDS; 150 ml of 20 X SSC (Maniatis et al.); and 238 ml of glass distilled water); 6.0 ml formamide; 0.4 ml 50 X
Denhardt's solution (5 g FICOLL; 5 g polyvinylpyrrolidone; 5 g bovine serum albumin; brought to a total volume of 500 ml
with glass distilled water); and 0.2 ml of single-stranded salmon sperm DNA (10 mg/ml). Each filter was placed in a
plastic bag and the prehybridization solution was added. The bag was sealed and incubated at 37°C overnight with
intermittent mixing of contents.

[0122] The clone 40 lambda DNA was isolated (Maniatis et al.) and digested with EcoRI. The resulting fragments
were fractionated on an agarose gel and visualized by ethidium bromide staining (Maniatis et al.). The DNA fragment
corresponding to the clone 40 insert, approximately 500 base pairs, was isolated from the agarose by electroelution
onto NA45. The aqueous suspension of the purified fragment was extracted once with a 1:1 phenol/chloroform solution,
and once with a 24:1 chloroform/isoamyl alcohol solution. The DNA was then precipitated with ethanol and resuspended
in sterile water.

[0123] The clone 40 insert was radioactively labelled by nick translation and used to probe the lambda gt10 plaque
lift filters. The prehybridization solution was removed from the filters. Each filter was hybridized with probe under the
following conditions: 5.0 ml of hybridization buffer (5 ml of 1 M Tris HCl, pH=7.5; 0.2 ml of 0.5 M EDTA, pH=8.0; 5.0 ml of
10% SDS; 14.9 ml of 20 X SSC (Maniatis et al.); 10 g of dextran sulfate; and glass distilled water to a total volume of
50 ml); 5.0 ml formamide; 0.4 ml 50 X Denhardt's solution (5 g FICOLL; 5 g polyvinylpyrrolidone; 5 g bovine serum
albumin; brought to a total volume of 500 ml with glass distilled water); and 0.2 ml of single-stranded salmon sperm DNA
(10 mg/ml). To this hybridization mix was added 50-250 µl of denatured probe (boiled 5-10 minutes and quick-chilled on
ice), resulting in approximately 10^6 cpm of labelled probe per filter. The hybridization mix containing the labelled probe
was then added to the plastic bag containing the filters. The bag was resealed and placed under a glass plate in a 37°C
water bath overnight with intermittent mixing of contents.
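
The per-filter prehybridization and hybridization mixes can be compared by computing their total volumes and the working concentrations of formamide, Denhardt's solution and carrier DNA. This is a rough sketch under stated assumptions: the `mixes_ml` layout and the `summarize` helper are introduced here for illustration, volumes are assumed to be additive, and the 50-250 µl of probe added to the hybridization mix is ignored.

```python
# Per-filter component volumes (ml) taken from the two recipes in the text.
mixes_ml = {
    "prehybridization": {
        "buffer": 5.4, "formamide": 6.0, "denhardt_50x": 0.4, "carrier_dna": 0.2,
    },
    "hybridization": {
        "buffer": 5.0, "formamide": 5.0, "denhardt_50x": 0.4, "carrier_dna": 0.2,
    },
}

CARRIER_STOCK_UG_PER_ML = 10_000.0  # salmon sperm DNA stock, 10 mg/ml


def summarize(mix):
    """Total volume and working concentrations for one mix."""
    total = sum(mix.values())
    return {
        "total_ml": total,
        "formamide_pct": 100.0 * mix["formamide"] / total,
        "denhardt_x": 50.0 * mix["denhardt_50x"] / total,
        "carrier_ug_per_ml": CARRIER_STOCK_UG_PER_ML * mix["carrier_dna"] / total,
    }


summaries = {name: summarize(mix) for name, mix in mixes_ml.items()}
for name, s in summaries.items():
    print(
        f"{name}: {s['total_ml']:.1f} ml, {s['formamide_pct']:.1f}% formamide, "
        f"{s['denhardt_x']:.2f}X Denhardt's, {s['carrier_ug_per_ml']:.0f} µg/ml carrier DNA"
    )
```

Both mixes come out close to 50% formamide, which together with the 37°C incubation sets the stringency of the screen.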

[0124] The next day the hybridization solution was removed and the filters washed three times, for 5 minutes each,
in 2 X SSC (Maniatis et al.) containing 0.5% SDS, at room temperature. The filters were then washed for one hour in 2
X SSC, containing 0.5% SDS, at 50°C. The filters were then washed for 15-60 minutes in 0.1 X SSC, containing 0.1%
SDS, at 50°C and finally in 2 X SSC for 15 minutes, 2-3 times, at room temperature. The washed filters were dried and then
exposed to X-ray film for detection of positive plaques.

[0125] Approximately 24 plaques from the lambda gt10 libraries were plaque purified from the approximately 200
plaques which tested positive by the hybridization screen (Table 2).



Table 2

Library    cDNA Source    Positives/Plate
EGM        Human          ≈50
BV         Human          ≈100
WEH        Human          ≈25
#771       Chimp          ≈10-15

Example 6 

Analysis of lambda gt10 cDNA Library Clones Homologous to the Clone 40 insert

[0126] The clones identified in Example 5 which have homology to the clone 40 insert were analyzed by standard 
restriction analysis and the insert sizes were determined. The original frequencies of positive hybridization signals per 
plate using the clone 40 insert as probe against the different cDNA sources are shown in the last column of Table 2. 
That these positive signals arose with different frequencies for the different cDNA sources in the lambda gt10 library 
suggests that the hybridization signals originated from the sera source rather than common contamination introduced 
during cDNA synthesis or cloning. 

[0127] One of the clones (108-2-5) from the EGM-generated cDNA library identified by hybridization with the clone
40 insert had an insert of approximately 3.7 kb and was chosen for further analysis. The insert was isolated by EcoRI
digestion of the clone, electrophoretic fractionation, and electroelution (Example 5). The insert was treated with DNase
I under conditions resulting in partial digestion (Maniatis et al.) to generate random fragments. The resulting fragments
were inserted into lambda gt11 vectors for expression. The lambda gt11 clones were then immunoscreened (Example
2) using human (BV and normal) and chimpanzee #771 sera. Twelve positive clones were identified by first round
immunoscreening with the human and chimp sera. Seven of the 12 clones were plaque purified and rescreened using chimp
serum (#771). Partial DNA sequences of the insert DNA were determined for two of the resulting clones that had the
largest sequences, designated 328-16-1 and 328-16-2. The 2 clones had sequences essentially identical to clone 40.

Example 7 

Preparing Amplified HCV cDNA Fragments 
A. Preparing cDNA fragments 

[0128] A plasma pool from a chimpanzee with chronic PT-NANBH was obtained from the Centers for Disease
Control (CDC) (Atlanta, GA). After direct pelleting or PEG precipitation, RNA was extracted from the virions by
guanidinium thiocyanate-phenol-chloroform extraction, according to published methods (Chomczynski). The pelleted
RNA was used for cDNA synthesis using oligo dT or random primers, or HCV sequence-specific primers, and a commercial
cDNA kit (Boehringer-Mannheim).

[0129] In one method, synthesis of first strand cDNA was achieved by addition of four primers, designated A, B, C,
and D, having the sequences shown below. These sequences are complementary to the HCV genomic regions indicated:

A: 5'-GCGGAAGCAATCAGTGGGGC-3', complementary to basepairs 394-413;
B: 5'-GCCGGTCATGAGGGCATCGG-3', complementary to basepairs 2960-2980;
C: 5'-CGAGGAGCTGGCCACAGAGG-3', complementary to basepairs 5239-5258; and
D: 5'-TGGTTCTATGGAGTAGCAGGCCCCG-3', complementary to basepairs 7256-7280.
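
Because the four primers are stated to be complementary to the genome, the matching genomic (sense-strand) sequence for each can be obtained by reverse-complementing the primer, and the primer length can be compared with the quoted basepair range. The sketch below is illustrative; `reverse_complement`, the `primers` dictionary and `length_mismatches` are names introduced here, and the basepair ranges are treated as inclusive.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Primer sequence (5'->3') and the genomic range it is stated to complement.
primers = {
    "A": ("GCGGAAGCAATCAGTGGGGC", 394, 413),
    "B": ("GCCGGTCATGAGGGCATCGG", 2960, 2980),
    "C": ("CGAGGAGCTGGCCACAGAGG", 5239, 5258),
    "D": ("TGGTTCTATGGAGTAGCAGGCCCCG", 7256, 7280),
}

length_mismatches = []
for name, (seq, start, end) in primers.items():
    span = end - start + 1  # inclusive range length
    if span != len(seq):
        length_mismatches.append(name)
    print(f"{name}: {len(seq)} nt primer, {span} bp range, "
          f"genomic sense sequence 5'-{reverse_complement(seq)}-3'")

print("primers whose length differs from the quoted range:", length_mismatches)
```

A length check of this kind only flags inconsistencies between a primer and its quoted coordinates; it does not indicate which of the two is in error.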

[0130] Second strand cDNA synthesis was performed by the method of Gubler and Hoffman. The reactions were 
carried out under standard cDNA synthesis methods given in the commercial kit. 
B. Amplifying the cDNA Fragments

[0131] The cDNA from above was blunt ended and ligated to the linker/primer having the following sequence:

Linker/primer:    5'-GGA ATT CGC GGC CGC TCG-3'  A-strand
               3'-TT CCT TAA GCG CCG GCG AGC-5'  B-strand

The cDNA and linker were mixed at a 1:100 molar ratio in the presence of 0.3 to 0.6 Weiss units of T4 DNA ligase. To
100 µl of 10 mM Tris-Cl buffer, pH 8.3, containing 1.5 mM MgCl2 and 50 mM KCl (Buffer A) was added about 1 x 10^-3
ng of the linker-ended cDNA, 2 µM of linker/primer A (A-strand) having the sequence (5'-GGAATTCGCGGCCGCTCG-3'),
200 µM each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of Thermus aquaticus DNA polymerase
(Taq polymerase). The reaction mixture was heated to 94°C for 30 sec for denaturation, allowed to cool to 50°C for 30
sec for primer annealing, and then heated to 72°C for 0.5-3 minutes to allow for primer extension by Taq polymerase.
The replication reaction, involving successive heating, cooling, and polymerase reaction, was repeated an additional 25
times with the aid of a Perkin-Elmer Cetus DNA thermal cycler. This results in a pool of SISPA (sequence-independent
single primer amplification)-amplified DNA fragments.
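
The two linker strands can be checked for base pairing, and the A-strand scanned for restriction sites, with a few lines of code. This is a sketch under stated assumptions: the variable names are introduced here, the B-strand is entered in the 3'->5' orientation in which it is printed, and the NotI recognition sequence (GCGGCCGC) is an observation added here that the text itself does not mention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

A_STRAND = "GGAATTCGCGGCCGCTCG"          # written 5'->3'
B_STRAND_3TO5 = "TTCCTTAAGCGCCGGCGAGC"   # written 3'->5', as printed in the text

# Base-by-base complement of the A-strand; read left to right it runs 3'->5'.
a_complement_3to5 = A_STRAND.translate(COMPLEMENT)

# The B-strand should end with the full complement of the A-strand; whatever
# precedes it is unpaired overhang at the B-strand 3' end.
pairs_fully = B_STRAND_3TO5.endswith(a_complement_3to5)
overhang = B_STRAND_3TO5[: len(B_STRAND_3TO5) - len(A_STRAND)]

# 0-based positions of restriction sites within the A-strand (-1 if absent).
ecori_pos = A_STRAND.find("GAATTC")     # EcoRI site
noti_pos = A_STRAND.find("GCGGCCGC")    # NotI site (assumption, see lead-in)

print(f"A-strand fully paired by B-strand: {pairs_fully}")
print(f"B-strand 3' overhang: {overhang!r}")
print(f"EcoRI site at index {ecori_pos}, NotI site at index {noti_pos}")
```

The asymmetric duplex (one blunt end, one end with a short overhang) is consistent with a linker intended to ligate to blunt-ended cDNA in a single orientation, while the A-strand doubles as the amplification primer.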

Example 8 

Preparing Primer-Pair Fragments

[0132] Amplified cDNA fragments from Example 7 were mixed with 100 µl Buffer A, 1 µM of equal molar amounts
of one of the primer pairs given below, 200 µM each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of Thermus aquaticus
DNA polymerase (Taq polymerase). Each primer pair includes a forward (upstream) primer Fi which is identical to
the coding strand at the upstream end of an overlap region Pi of duplex genomic DNA and a reverse primer Ri which is
complementary to the coding strand at the downstream end of the region Pi. The sets of primers each define an overlap region
of about 200 basepairs, and the spacing between adjacent overlapping primer regions (i.e., between adjacent
Fi/Ri pairs) is about 0.5-1 kilobase. The regions of HCV which are complementary to the primers are given below:

F1, basepairs 183-201; R1, basepairs 361-380
F10, basepairs 576-595; R10, basepairs 841-860
F2, basepairs 1080-1100; R2, basepairs 1254-1273
F3, basepairs 1929-1948; R3, basepairs 2067-2086
F4, basepairs 2754-2733; R4, basepairs 2920-2940
F5, basepairs 3601-3620; R5, basepairs 3745-3764
F6, basepairs 4301-4320; R6, basepairs 4423-4442
F12, basepairs 4847-4865; R12, basepairs 4715-4734
F7, basepairs 5047-5066; R7, basepairs 5200-5216
F8, basepairs 5885-5904; R8, basepairs 6028-6047
F9, basepairs 6902-6921; R9, basepairs 7051-7070

[0133] Polymerase Chain Reaction (PCR) amplification of the SISPA-amplified cDNA fragments with each Fi/Ri
primer pair was carried out under conditions similar to those used above, with about 25 cycles.
[0134] The amplified fragment mixtures from above were each fractionated by electrophoresis on 1.5% agarose
and transferred to nitrocellulose filters (Southern). Hybridization of the nitrocellulose-bound fragments, each with an
internal-sequence oligonucleotide probe, confirmed that each fragment contained the expected sequences. Hybridization
was carried out with an internal oligonucleotide radiolabeled by polynucleotide kinase, according to standard
methods.

Example 9

Preparing Linking Fragments

[0135] This example describes preparing large overlapping linking fragments of the HCV sequence. SISPA-amplified
cDNA fragments from Example 7 were mixed with 100 µl Buffer A, 1 µM of equal molar amounts of forward and
reverse primers in each of the primer pairs given below, 200 µM each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of
Thermus aquaticus DNA polymerase (Taq polymerase), as in Example 8. Each primer pair includes a forward primer Fi
and a reverse primer Rj, where Fi is the forward primer for one overlap region Pi, and Rj is the reverse primer of the
adjacent overlap region Pj. Thus each linking fragment spans two adjacent overlap regions. The sets of primers each define a
linking fragment of about 0.5-1 kilobases. The sequences of the primer pairs are given in Example 8. The overlapping
linking fragments of the HCV sequence (Appendix) spanned by each primer pair are given below:

F1/R10, basepairs 183-860
F10/R2, basepairs 576-1273
F2/R3, basepairs 1080-2086
F3/R4, basepairs 1929-2940
F4/R5, basepairs 2754-3762
F5/R6, basepairs 3601-4442
F6/R12, basepairs 4301-4865
F12/R7, basepairs 4715-5216
F7/R8, basepairs 5047-6047
F8/R9, basepairs 5885-7070
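
The coordinates above can be used to confirm that the ten linking fragments tile the region without gaps and to compute each fragment's length and its overlap with the next fragment. The following is a minimal sketch: the list `linking_fragments` and the derived variables are names introduced here, the label of the first fragment (F1/R10) is inferred from the primer positions in Example 8, and ranges are treated as inclusive.

```python
# (label, start, end) for each linking fragment, in genome order.
linking_fragments = [
    ("F1/R10", 183, 860),    # label inferred from the Example 8 primer positions
    ("F10/R2", 576, 1273),
    ("F2/R3", 1080, 2086),
    ("F3/R4", 1929, 2940),
    ("F4/R5", 2754, 3762),
    ("F5/R6", 3601, 4442),
    ("F6/R12", 4301, 4865),
    ("F12/R7", 4715, 5216),
    ("F7/R8", 5047, 6047),
    ("F8/R9", 5885, 7070),
]

# Inclusive length of each fragment.
lengths = {label: end - start + 1 for label, start, end in linking_fragments}

# Overlap between each fragment and the next; a value <= 0 would mean a gap.
overlaps = {}
for (label_a, _, end_a), (label_b, start_b, _) in zip(
    linking_fragments, linking_fragments[1:]
):
    overlaps[f"{label_a} | {label_b}"] = end_a - start_b + 1

tiles_without_gaps = all(v > 0 for v in overlaps.values())
covered = (linking_fragments[0][1], linking_fragments[-1][2])

for label, n in lengths.items():
    print(f"{label}: {n} bp")
for pair, n in overlaps.items():
    print(f"overlap {pair}: {n} bp")
print(f"gap-free tiling: {tiles_without_gaps}; covered region: {covered[0]}-{covered[1]}")
```

Overlaps of this size between neighbouring fragments are what allow a peptide library built from each fragment to be related back to a continuous stretch of the genome.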

[0136] Two-primer amplification of the SISPA-amplified cDNA fragments with each Fi/Rj primer pair was carried out
under conditions similar to those described above, with about 25 cycles.

[0137] The amplified fragment mixtures from above were each fractionated by agarose electrophoresis on 1.2 % 
agarose, and transferred to nitrocellulose filters (Southern) for hybridization with radiolabeled internal oligonucleotide 
probes as above. The analysis confirmed that each linking fragment contained the two end-primer sequences from 
adjacent overlap regions. The sequences contained in each of the linking fragments are indicated in the Appendix. 

Example 10

Preparing Cloned Peptide Fragments 

A. DNA Fragment Digestion 

[0138] Each of the ten linking fragments from Example 9 was suspended in a standard digest buffer (0.5 M Tris HCl,
pH 7.5; 1 mg/ml BSA; 10 mM MnCl2) to a concentration of about 1 mg/ml and digested with DNase I at room temperature
for various times (1-5 minutes). These reaction conditions were determined from a prior calibration study, in which
the incubation time required to produce predominantly 100-300 basepair fragments was determined. The material was
extracted with phenol/chloroform before ethanol precipitation.

[0139] The fragments in the digest mixture were blunt-ended and ligated with EcoRI linkers. The resultant fragments
were analyzed by electrophoresis (5-10 V/cm) on 1.2% agarose gels, using PhiX174/HaeIII and lambda/HindIII
size markers. The 100-300 bp fraction was eluted onto NA45 strips (Schleicher and Schuell), which were then placed
into 1.5 ml microtubes with eluting solution (1 M NaCl, 50 mM arginine, pH 9.0), and incubated at 67°C for 30-60 minutes.
The eluted DNA was phenol/chloroform extracted and then precipitated with two volumes of ethanol. The pellet
was resuspended in 20 µl TE buffer (0.01 M Tris HCl, pH 7.5, 0.001 M EDTA).

B. Cloning the Digest Fragments 

[0140] Lambda gt11 phage vector (Young et al.) was obtained from Promega Biotec (Madison, WI). This cloning
vector has a unique EcoRI cloning site 53 base pairs upstream from the beta-galactosidase translation termination
codon. The partial digest fragments from each linking fragment in Part A were introduced into the EcoRI site by mixing
0.5-1.0 µg EcoRI-cleaved lambda gt11, 0.3-3 µl of the above sized fragments, 0.5 µl 10X ligation buffer (above), 0.5 µl
DNA ligase (200 units), and distilled water to 5 µl. The mixture was incubated overnight at 14°C, followed by in vitro
packaging, according to standard methods (Maniatis, pp. 256-268).

[0141] The packaged phage were used to infect E. coli strain KM392, obtained from Dr. Kevin Moore, DNAX (Palo
Alto, CA). Alternatively, E. coli strain Y1090, available from the American Type Culture Collection (ATCC No. 37197),
could be used. A lawn of KM392 cells infected with about 10^3-10^4 pfu of the phage stock from above was prepared on
a 150 mm plate and incubated, inverted, for 5-16 hours at 27°C. The infected bacteria were checked for loss of
beta-galactosidase activity (clear plaques) in the presence of X-gal using a standard X-gal substrate plaque assay method
(Maniatis).

[0142] Identification of single plaques containing a digest-fragment insert was confirmed as follows. Clear single
plaques (containing the progeny of a single phage) were removed from the plate and suspended in extraction buffer
(Maniatis) to release the phage DNA. The phage extract was added to the above DNA amplification mixture in the
presence of primers which are about 70 basepairs away in either direction from the EcoRI site of lambda gt11. Thus
phage containing a digest-fragment insert will yield an amplified digest fragment of about 140 basepairs plus insert.
Phage DNA amplification was carried out as described above, with 25 cycles of amplification. The reaction material
from each plaque tested was fractionated on 1.5% agarose, and examined for the size of amplified digest fragments.
Non-recombinant phage gave a 140 basepair band, and recombinant phage, a band which is 140 basepairs plus the insert
sequence in size. The results are shown in column 2 (REC Freq) of Table 3 below, for the six linking-fragment libraries
indicated in the first column of the table. The denominator in the column-2 entries is the total number of plaques
assayed by primer amplification. The numerator is the number of clear plaques containing fragment inserts. Thus, 3/15
means that 3 plaques tested positive by PCR out of a total of 15 clear plaques assayed.
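
The band-size reasoning in this paragraph can be sketched in a few lines of Python. This is only an illustration: the function names and the 20 basepair tolerance are hypothetical and are not part of the described procedure; only the 140 basepair empty-vector band (two primers about 70 basepairs from the EcoRI site) is taken from the text.

```python
FLANK_BP = 140  # band expected from non-recombinant phage (about 70 bp on each side of the EcoRI site)


def classify_band(band_bp, tolerance_bp=20):
    """Return (is_recombinant, estimated_insert_bp) for one amplified band.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    """
    insert_bp = band_bp - FLANK_BP
    if insert_bp <= tolerance_bp:
        return False, 0
    return True, insert_bp


def recombinant_frequency(band_sizes):
    """Return (recombinant plaques, plaques assayed), i.e. a REC Freq style entry."""
    hits = sum(1 for size in band_sizes if classify_band(size)[0])
    return hits, len(band_sizes)
```

For example, a 440 basepair band would be read as a recombinant carrying an insert of about 300 basepairs, while a 140 basepair band would be scored as non-recombinant.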



Table 3

Library (1)    REC Freq (2)    1° Screen (3)    PA/REC (4)
F2R3 #2        3/15            2                0.33
F3R4 #1        7/12            0
F4R5 #3        9/10            10               0.37
F5R6 #5        11/12           37               1.35
F7R8 #7        0/12            1
F8R9 #10       3/12            58               7.73

1 - Libraries constructed by partial DNaseI digestion of indicated linking clone

2 - Recombinant frequency determined by PCR with insert flanking lambda gt11 primers

3 - Primary screening with chronic human PT-NANBH serum (1:100) on 1.5X10 phage

4 - PA/REC indicates the number of positive areas detected per actual number of recombinant plated

[0143] The library of digest fragments constructed for each linking fragment was screened for expression of pep- 
tides which are immunoreactive with a human PT-NANBH serum. The lawn of phage-infected bacteria was overlaid with 
a nitrocellulose sheet, transferring PT-NANBH recombinant peptides from the plaques to filter paper. The plate and filter 
were indexed for matching corresponding plate and filter positions. 

[0144] The filter was removed after 6-12 hours, washed three times in TBS buffer (10 mM Tris, pH 8.0, 150 mM
NaCl), blocked with AIB (TBS buffer with 1% gelatin), washed again in TBS, and incubated overnight with antiserum
(diluted to 1:100 in AIB, 12-15 ml/plate). The sheet was washed twice in TBS and then incubated with
alkaline-phosphatase-conjugated anti-human IgG to attach the labeled antibody at filter sites containing antigen
recognized by the antiserum. After a final washing, the filter was developed in a substrate medium containing 33 µl NBT
(50 mg/ml stock solution maintained at 4°C) mixed with 16 µl BCIP (50 mg/ml stock solution maintained at 4°C) in 5 ml
of alkaline phosphatase buffer (100 mM Tris, pH 9.5, 100 mM NaCl, 5 mM MgCl2). Reacted substrate precipitated at
points of antigen production, as recognized by the antiserum.
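
As a quick check of the working concentrations implied by these volumes, the dilution arithmetic can be sketched as follows. The helper names are hypothetical, and the 33 µl and 16 µl stock volumes are read from the paragraph above; the calculation assumes the stock volumes simply add to the 5 ml of buffer.

```python
def substrate_concentrations(buffer_ml, additions):
    """Final concentration (mg/ml) of each stock after mixing into the buffer.

    additions maps a name to (stock concentration in mg/ml, volume added in microliters).
    """
    added_ml = sum(volume_ul for _, volume_ul in additions.values()) / 1000.0
    total_ml = buffer_ml + added_ml
    return {
        name: stock_mg_per_ml * (volume_ul / 1000.0) / total_ml
        for name, (stock_mg_per_ml, volume_ul) in additions.items()
    }


def serum_volume_ul(final_volume_ml, dilution=100):
    """Microliters of serum needed for a 1:dilution mix of the given final volume."""
    return final_volume_ml * 1000.0 / dilution


conc = substrate_concentrations(5.0, {"NBT": (50.0, 33.0), "BCIP": (50.0, 16.0)})
```

Under these assumptions NBT comes out at roughly 0.33 mg/ml and BCIP at roughly 0.16 mg/ml, and a 12-15 ml antiserum incubation at 1:100 would need about 120-150 µl of serum per plate.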

[0145] The total number of plaques which showed antigen-positive reaction (positive areas, PA) in the primary
screen is given in the third column in Table 3. The fourth column in the table is the frequency of positive areas per
total number of recombinant phage screened (x 10^3). This last column is therefore a measure of the relative
immunogenicity of antigen expressed from a particular linking fragment using this particular serum sample.
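
The PA/REC column can be reproduced with a short calculation. The sketch below is illustrative only: the total of 30,000 phage screened per library is an assumption (the exponent in footnote 3 of Table 3 is not legible in this copy), chosen because it reproduces the tabulated values; the recombinant frequencies and positive-area counts are copied from Table 3.

```python
from fractions import Fraction

# ASSUMPTION: total phage screened per library; not stated legibly in the source.
TOTAL_PHAGE = 30_000

# library -> (recombinant frequency, positive areas in the primary screen), from Table 3
TABLE3 = {
    "F2R3 #2": (Fraction(3, 15), 2),
    "F4R5 #3": (Fraction(9, 10), 10),
    "F5R6 #5": (Fraction(11, 12), 37),
    "F8R9 #10": (Fraction(3, 12), 58),
}


def pa_per_rec_x1000(rec_freq, positive_areas, total_phage=TOTAL_PHAGE):
    """Positive areas per recombinant phage screened, scaled by 10^3."""
    recombinants = total_phage * rec_freq
    return float(positive_areas / recombinants * 1000)


pa_rec = {lib: round(pa_per_rec_x1000(freq, pa), 2) for lib, (freq, pa) in TABLE3.items()}
```

The two libraries with blank PA/REC entries are omitted: F7R8 #7 has a recombinant frequency of 0/12, so the ratio is undefined, and F3R4 #1 gave no positive areas.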

Example 11

Screening Digest Fragments

[0146] The digest-fragment libraries of each of the ten linking fragments from Example 9 were screened with sera
from a human patient with chronic PT-NANBH and with pooled sera from chimpanzees with acute PT-NANBH infection
and chronic PT-NANBH infection. Individual chronic and acute chimpanzee sera from 5 chimpanzees were obtained
from the Centers for Disease Control.

[0147] The digest-fragment libraries from the linking fragments indicated in Table 4 below were screened with each
of the three sera, using the screening procedure described in Example 10. The total number of positive areas observed
in each plate (making up one fragment library) is given in the table. The entries in the table which are not in
parentheses represent the number of positive areas which were confirmed by plaque purification, i.e., by replating
plaques from the positive areas at low dilution and confirming a positive area (secondary screen). Typically about
90-95 percent of the positive areas in the primary screen tested positive by secondary screening. The entries in
parentheses indicate positive areas which have not been confirmed in a secondary screen.

[0148] As seen from Table 4, all but one of the linking fragment libraries contained sequences encoding peptide
antigens which are immunoreactive with either chronic human or chimpanzee infected sera. Five of the libraries contain
sequences encoding antigens which are immunoreactive with acute sera, indicating that one or more of the antigens in
this group are effective to detect acute-infection serum. Three of these latter libraries (F3/R4, F12/R7, and F7/R8)
gave 10 or more positives in each library. These data are not corrected for the recombinant frequency in a particular
library and therefore do not reflect the comparative immunogenicity of the various linking fragments.



Table 4

Library    Human P.P. Clones    Acute Pool P.P. Clones    Chronic Pool P.P. Clones
F1R10      0                    0                         0
F10R2      4                    2                         4
F2R3       4                    0                         1
F3R4       0                    10                        10
F4R5       5                    0                         7
F5R6       34                   0                         (42)
F6R12      (400)                5                         10(200)
F12R7      2                    17(200)                   9(200)
F7R8       0                    20                        10(130)
F8R9       60                   0                         1

( ) = not plaque purified

P.P. = Plaque Pure

Acute Pool = CDC Panel of Chimps

Chronic Pool = CDC Panel of Chimps
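
The observations drawn from Table 4 can be checked with a short tally. This is a sketch only: the dictionary layout and variable names are hypothetical, and entries that appear solely in parentheses (not plaque purified) are counted as 0 confirmed positives here.

```python
# library -> (human, acute pool, chronic pool) plaque-pure positives, copied from Table 4.
TABLE4 = {
    "F1R10": (0, 0, 0),
    "F10R2": (4, 2, 4),
    "F2R3": (4, 0, 1),
    "F3R4": (0, 10, 10),
    "F4R5": (5, 0, 7),
    "F5R6": (34, 0, 0),    # chronic pool: (42), unconfirmed
    "F6R12": (0, 5, 10),   # human: (400), unconfirmed
    "F12R7": (2, 17, 9),
    "F7R8": (0, 20, 10),
    "F8R9": (60, 0, 1),
}

# Libraries with no confirmed positives against any serum.
non_reactive = [lib for lib, counts in TABLE4.items() if sum(counts) == 0]

# Libraries with confirmed positives against the acute chimpanzee pool.
acute_reactive = [lib for lib, (_, acute, _) in TABLE4.items() if acute > 0]

# Acute-reactive libraries with 10 or more confirmed acute positives.
acute_high = [lib for lib in acute_reactive if TABLE4[lib][1] >= 10]
```

With the counts as tabulated, one library (F1R10) is non-reactive, five libraries react with the acute pool, and three of those reach 10 or more confirmed positives.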



Example 12 

Immunoscreening for 409-1-1 Antigen

A. Plaque Immunoscreening

[0149] Several clear plaques identified in the primary screen of the F4/R5 linking fragment were replated and plaque
purified. One of the purified plaques was designated gt11/409-1-1(c-a). The digest fragment contained in clone
409-1-1(c-a) corresponds to two sets of base pairs present in the HCV genome and present in clone 409-1-1(abc). For
ease of reference three regions (a, b, and c) have been designated in the 409-1-1(abc) clone (see below and Figure 5).
The longest homology of base pairs corresponds approximately to nucleotides 2754 to 3129 of the Appendix (the "a"
region, see Figure 5, region delineated by boxes) and the shorter homology corresponds approximately to nucleotides
3242 to 3311 of the Appendix (the "c" region, see Figure 5); normally the "c" region is located approximately 112
nucleotides distal to the 3' end of the "a" region (see Figure 5). The complete sequence of the gt11/409-1-1(c-a)
insert is given in Figure 6 and presented as SEQ ID NO:7. This clone arose through a ligation event between two
independent DNaseI fragments generated from the F4/R5 linking clone and has ATCC No. 40792. A related clone,
designated 409-1-1(abc), has been described in co-owned patent application Ser. No. 505,611 and has ATCC No. 40876.

[0150] A lambda gt11 clone corresponding to the immunoreactive sequence reported in the EPO application
88310922.5, and designated 5-1-1, was prepared by primer-specific amplification of the amplified cDNA fragments
generated in Example 7. The 5-1-1 sequence corresponds to basepairs 3730-3858 of the HCV sequence (Appendix),
in the linking fragment Fg/R^. The primers used for fragment amplification are 20 basepair oligomers complementary to
the forward and reverse sequences of the 3732-3857 basepair 5-1-1 sequence. Both oligomers have EcoRI sites
incorporated into their ends and the forward oligomer is designed to ensure a contiguous open reading frame with the
beta-galactosidase gene. The amplified 5-1-1 sequence was purified by agarose gel electrophoresis, and cloned into
lambda gt11 phage. Amplification and cloning methods were as described above. Phage containing the 5-1-1 sequence
were identified and purified by primary and secondary screening, respectively, with human PT-NANBH serum, also as
described above.

[0151] The purified gt11/409-1-1(c-a) and gt11/5-1-1 clones were each mixed with negative lambda gt11 phage,
plated and immunoscreened with a number of different donor sera from normal and NANBH-infected humans and
chimpanzees, as indicated in Table 5 below. Each plate was divided into several equal-area sections, and the
corresponding sections on the nitrocellulose transfer filter were separately screened with the donor sera indicated,
using the immunoscreening method described in Example 11. The number of positives detected for each group of sera by
the 5-1-1 and 409-1-1(c-a) peptides is shown in Table 5, together with a comparison with the C-100 test in the ELISA
format.



Table 5

                                     # Positive
Source    Diagnosis    # Donors      5-1-1    409-1-1(c-a)    C-100
Human     Normal       2             0        0               NT
Human     NANB         6             4        5               0/1*
Chimp     Normal       7             0        0               0/5
Chimp     Acute        5             1        3               0/5
Chimp     Chronic      8             7        7               5/5

NT = not tested;

* only BV serum was tested; N/5 means N positives out of five sera tested.



B. Western Blot Screening 

[0152] For Western blot screening, gt11/409-1-1(c-a) phage from Example 11 was used to infect E. coli BNN103
temperature-sensitive bacteria. These bacteria were obtained from the American Type Culture Collection. The bacterial
host allows expression of a beta-galactosidase/peptide antigen fused protein encoded by the vector under temperature
induction conditions (Hunyh).

[0153] Infected bacteria were streaked, grown at 32°C overnight or until colonies were apparent, and individual
colonies were replica plated and examined for growth at 32°C and 42°C. Bacterial colonies which grew at 32°C, but not
42°C, indicating integration of the phage genome, were used to inoculate 1 ml of NZYDT (Maniatis) broth. A saturated
overnight bacterial culture was used to inoculate a 10 ml culture, which was incubated with aeration to an O.D. of
about 0.2 to 0.4, typically requiring 1 hour incubation. The culture was then brought to 43°C quickly in a 43°C water
bath and shaken for 15 minutes to induce lambda gt11 peptide synthesis, and incubated further at 37°C for 1 hour.

[0154] The cells were pelleted by centrifugation, and 1 ml of the pelleted material was resuspended in 100 µl of
lysis buffer (62 mM Tris, pH 7.5 containing 5% mercaptoethanol, 2.4% SDS and 10% glycerol). Aliquots (about 15 µl)
were loaded directly onto gels and fractionated by SDS-PAGE. After electrophoresis, the fractionated bands were
transferred by electroelution to nitrocellulose filters, according to known methods (Ausubel et al.).

[0155] The lysate was treated with DNaseI to digest bacterial DNA, as evidenced by a gradual loss of viscosity in
the lysate. An aliquot of the material was diluted with Triton X-100™ and sodium dodecyl sulfate (SDS) to a final
concentration of 2% Triton X-100™ and 0.5% SDS. Non-solubilized material was removed by centrifugation and the
supernatant was fractionated by SDS polyacrylamide electrophoresis (SDS-PAGE). A portion of the gel was stained,
to identify the peptide antigen of interest, and the corresponding unstained band was transferred onto a nitrocellulose
filter.

[0156] The 5-1-1 antigen coding sequence (Example 11) was also expressed as a glutathione-S-transferase fusion
protein using the pGEX vector system, according to published methods (Smith). The fusion protein obtained from
bacterial lysate and fractionated by SDS-PAGE was transferred to a nitrocellulose filter for Western blotting, as above.

[0157] Western blotting was carried out substantially as described in Example 10. Briefly, the filters were blocked
with AIB, then reacted with the serum samples identified in Table 5, including human and chimpanzee normal, chronic
NANBH, and hepatitis B (HBV) serum samples. The presence of specific antibody binding to the nitrocellulose filters was
assayed by further immunobinding of alkaline-phosphatase labelled anti-human IgG. The results of the Western blot
analysis with the Sj26/5-1-1 and beta-gal/409-1-1(c-a) fusion proteins are shown in Table 6. The data confirm that
409-1-1(c-a) and 5-1-1 peptide antigens are specifically immunoreactive with human and chimpanzee NANBH antisera.

Table 6

                                     # Positive
Source    Diagnosis    # Donors      Sj26/5-1-1    beta-gal/409-1-1(c-a)
Human     Normal       2             0             0
Human     NANB         7             5             5
Human     HBV          1             0             0
Chimp     Normal       5             0             0
Chimp     NANB         6             5             5
Chimp     HBV          1             0             0
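
As a rough illustration of how the Table 6 counts support the specificity statement, the rows can be pooled into NANB and non-NANB donors. The row layout and the `summarize` helper are hypothetical; the counts are copied from Table 6.

```python
# (source, diagnosis, donors, Sj26/5-1-1 positives, beta-gal/409-1-1(c-a) positives)
TABLE6 = [
    ("Human", "Normal", 2, 0, 0),
    ("Human", "NANB", 7, 5, 5),
    ("Human", "HBV", 1, 0, 0),
    ("Chimp", "Normal", 5, 0, 0),
    ("Chimp", "NANB", 6, 5, 5),
    ("Chimp", "HBV", 1, 0, 0),
]


def summarize(rows, antigen_index):
    """Pool (positives, donors) for NANB donors and for all other donors."""
    nanb_pos = nanb_total = other_pos = other_total = 0
    for row in rows:
        donors, positives = row[2], row[antigen_index]
        if row[1] == "NANB":
            nanb_pos += positives
            nanb_total += donors
        else:
            other_pos += positives
            other_total += donors
    return {"nanb": (nanb_pos, nanb_total), "non_nanb": (other_pos, other_total)}


summary_5_1_1 = summarize(TABLE6, 3)
summary_409 = summarize(TABLE6, 4)
```

For both fusion proteins this gives 10 positives among 13 NANB donors and no positives among the 9 normal or HBV donors.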



Example 13 

Generation of Alternative Clones 

[0158] Alternative clones were generated from the region identified in Example 12 as encoding antigen specifically
immunoreactive with human and chimpanzee NANBH antisera. The primers shown in Table 7 were selected from the
HCV or 409-1-1(abc) coding sequences to generate a variety of overlapping clones.



Table 7

Primer           Sequence
33C-F1           CCGAATTCGCGGTGGACTTTATCCCTGT
33C-R1           CCGAATTCCAGAGCAACCTCCTCGATG
409-1-1(c-a)F    CCGAATTCCGCACGCCCGCCGAGACTAC
409-1-1-F1       CCGAATTCTCCACCACCGGAGAGATCCC
409-1-1-R2       CCGAATTCCACACGTATTGCAGTCTATC
409-1-1-F3       CCGAATTCGTCACCCAGACAGTCGAT
409-1-1-R5       CCGAATTCCCCTCCCAAAATTCAAGATGG
409-1-1(c-a)R    CCGAATTCGCCAGTCCTGCCCCGACGTT
409-1-1CR        CCGAATTCGTCCTGGCACACGGGAAG



[0159] The primers shown in Table 7 were used in DNA amplification reactions as described in Examples 7B and
8; the primers and templates used in each reaction are shown in Table 8. The amplified fragments were then treated
with the Klenow fragment of DNA polymerase I, under standard conditions (Maniatis et al.), to fill in the ends of the
molecules. The blunt-end amplified fragments were digested with EcoRI under standard conditions and cloned into lambda
gt11 expression vectors essentially as described in Example 10B. The resulting inserts are aligned for comparison in
Figure 7.

Table 8

Generated Fragment    Template             Primers
33C                   cDNA*                33C-F1 and 409-1-1-R2
33Cu                  cDNA*                33C-F1 and 33C-R1
409-1-1(F1R2)         gt11 409-1-1(c-a)    409-1-1-F1 and 409-1-1-R2
409-1-1(a)            gt11 409-1-1(c-a)    409-1-1-F1 and 409-1-1(c-a)R
409-1-1(c)            gt11 409-1-1(c-a)    409-1-1(c-a)F and 409-1-1CR
409-1-1(c+270)        gt11 409-1-1(c-a)    409-1-1(c-a)F and 409-1-1-R2
409-1-1u              gt11 409-1-1(c-a)    409-1-1-F3 and 409-1-1(c-a)R

* Amplified cDNA fragments from Example 7
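
Because the amplified fragments are cut with EcoRI before cloning, each Table 7 primer should carry an EcoRI recognition site (GAATTC) near its 5' end. The sketch below checks this and separates the 5' clamp from the template-specific portion; the `split_primer` helper is hypothetical and is offered only as a sanity check on the tabulated sequences.

```python
ECORI_SITE = "GAATTC"

# Primer sequences copied from Table 7.
PRIMERS = {
    "33C-F1": "CCGAATTCGCGGTGGACTTTATCCCTGT",
    "33C-R1": "CCGAATTCCAGAGCAACCTCCTCGATG",
    "409-1-1(c-a)F": "CCGAATTCCGCACGCCCGCCGAGACTAC",
    "409-1-1-F1": "CCGAATTCTCCACCACCGGAGAGATCCC",
    "409-1-1-R2": "CCGAATTCCACACGTATTGCAGTCTATC",
    "409-1-1-F3": "CCGAATTCGTCACCCAGACAGTCGAT",
    "409-1-1-R5": "CCGAATTCCCCTCCCAAAATTCAAGATGG",
    "409-1-1(c-a)R": "CCGAATTCGCCAGTCCTGCCCCGACGTT",
    "409-1-1CR": "CCGAATTCGTCCTGGCACACGGGAAG",
}


def split_primer(sequence):
    """Return (5' clamp, template-specific part) around the first EcoRI site."""
    position = sequence.find(ECORI_SITE)
    if position < 0:
        raise ValueError("no EcoRI site found")
    return sequence[:position], sequence[position + len(ECORI_SITE):]


parts = {name: split_primer(seq) for name, seq in PRIMERS.items()}
```

Every primer in the table splits into a two-base "CC" clamp, the EcoRI site, and an 18 to 21 base template-specific region.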

Example 13 

Immunoscreening of the Alternative Clones

[0160] The alternative clones generated in Example 12 were immunoscreened essentially as described in Example
10B. Clones 409-1-1(abc) and 409-1-1(c-a), generated in Example 12, were also included in the following
immunoscreenings. The results of the preliminary immunoscreening are shown in Table 9.

Table 9

Clone             GLI-1    FEC
33C               +        ND*
33cu              +        ND
409-1-1(abc)      +        ND
409-1-1(F1R2)     +        ND
409-1-1(a)        +        ND
409-1-1(ca)       +        ND
409-1-1(c)
409-1-1(c+270)    +        ND
409-1-1u

*Not Done



[0161] The GLI-1 serum was a human chronic PT-NANBH serum. If a clone tested negative with GLI-1 it was further
examined by screening with FEC, a human chronic PT-NANBH serum.

[0162] The seven of the nine alternative clones which tested positive by the above preliminary immunoscreening were
more extensively screened against a battery of sera. In addition, clone C100 (see Background) was included in the
screening. The results of this more exhaustive screening are presented in Table 10.

Table 10

Serum ANTIGEN











409-1-1 


409- 1-1 


ArtQ. 1.1 

**V5J • 1 * 1 






- 1 c 




C100


33C


33Cu


abc


F1R2


a


c + 270


ca




SKF(-) 




- 


- 


- 


- 




- 


- 


- 


FEC< + ) 


+ 


+ 3 


+ 3 


+ 1 


+ 2 


+ 2 


- 


+ 2 


BV 


• 


+ 2 


+ 3 


I 


+ 1 


+ 1 


• 


+ 1 


- 


Bar 




+ 2 


+ 2 


1 


• 


- 


• 


- 


- 


PP(-> 




- 


• 


- 




- 


- 


- 




AP 


- 


+ 1 


+ 2 


- 


1 


- 


- 


I 




CP 


+ 


+ 2 


+ 3 


+ 2 




+ 3 


1 


+ 3 


+ 2 


1 
2 




* 


- 


- 


* 




- 


- 


- 


3 




- 


- 












\ 


4 




- 


• 


1 






- 


- 


1 


5 




- 


+ 1 


- 


- 




• 


- 


- 


6 




+ 1 


+ 3 


+ 1 


+ 1 


+ 1 




+ 1 


+ 1 


7 




+ 2 


+ 3 


+ 1 


-2 


^2 


• 


+ 2 


+ 1 


3d 




- 


1 


+ 1 


1 


1 






1 


39 




• 


+ 1 


1 


- 1 






1 


1 


40 




+ 1 


+ 2 


+ 1 


- 1 






+ 1 


+ 1 


41 




+ 2 


+ 3 


+ 1 


+ 1 


* 1 




+ 2 


+ 1 


42 




+ 2 


+ 3 


+ 1 


- 1 


+ 1 




+ 2 


+ 1 


43 
44 




1 


1 














45 




1 


+ 1 


1 


1 






1 


1 


46 


+ 


+ 1 


+ 2 


+ 1 


-2 


+ 1 




+ 1 


1 


47 




+ 1 


+ 2 


+ 2 


•2 


t3 


m 


+3 


+1 


aia 




+3 


+3 


+ 1 


+ 3 


+3 




+3 




A7 




+3 


+3 


+ 1 


+ 1 


+3 




+3 


+3 


C7 




+2 


+3 














A3 




+3 


+3. 


+ 1 


+2 


+ 1 




+2 




B7 




+2 


+3~ 


I 


+3 


+3 




+ 3 


I 


C12 




+2 


+3 


















[0163] The serum samples used for screening were identified as follows: SKF, PT-NANBH negative; FEC,
PT-NANBH positive; BV, community acquired NANBH; Bar, PT-NANBH positive; PP (pre-inoculation pooled chimpanzee
serum), PT-NANBH negative; AP (acute HCV pooled chimpanzee serum), PT-NANBH positive; and CP (chronic HCV
pooled chimpanzee serum), PT-NANBH positive. The numbered serum samples correspond to human clinical serum
samples which were PT-NANBH positive. The PP, CP, and AP sera were pooled serum samples from 5 different
chimpanzees; the chimpanzee serum samples were obtained from the Centers for Disease Control. The scoring system
presented in Table 10 is a qualitative scoring system defined as follows: (-), a clear negative; (+), (1+), (2+), (3+),
increasing strength of positive signal, with (3+) being the strongest signal; and (I) stands for Indeterminate, where
two readings were different and not repeated.

[0164] In view of the data presented in Table 10, the sensitivity of the antigens in terms of immunoscreening is 33cu
> 33c > 409-1-1(c-a) > 409-1-1-F1R2 > 409-1-1(abc) ≥ 409-1-1a > 5-1-1 > 409-1-1-(c+270). Although 33cu and 33c
were sensitive antigens, they reacted with high background against all sera. Accordingly, the 409-1-1 series are more
useful as diagnostic antigens since they are more specific to HCV induced antibodies.

[0165] The immunoscreening was further extended to include the clone 36 and 45 (corresponds to clone 40)
encoded epitopes which were identified above. Table 11 shows the results of the immunoscreening.

Table 11

PANEL I: SEROCONVERSION SPECIMENS 



SERUM ANTIGEN

C-100 33C 5.1.1 409-1-1(c-a) 36 45 gt11

GLI-1 + 4+ 2+ 4+ - 3+

FEC + 4+ 3+ 4+ 3+

BV - 3+ 3+

SKF(norm) - 

1- N01/D69 I 

2- "/D124 + 



3- "/D146 I • _

4- "/D211 + 

5- N00/D22 + J j

6- "/D29 2+ + 2+

7- "/D41 - 3+ 2+ 3+

8- "/D60 - 4+ 3+ 4+

9- "/D137 + 4+ 4+ 4+

10- N240/D0 I _ j 

11- M /D45 - 

12- "/D71 I - j 

13- H /D89 - I 

14- "/D106 I 

15- M /D155 - I 






16- N228/D0 - I 

17- M /D31 - I 

18- "/D41 - I 

19- "/D51 - I 

20- "/D73 - I 

21- "/D93 

22- "/D127 - 



23- N192/D114 

24- "/D184 - 
25- "/D224 -

26- "/D280 - 

27- N176/D0 

28- "/D66 
29- "/D77

30- "/D94 

31- »/D200 - 



32- N170/D0 

33- M /D27 

34- "/D49 

35- M /D64 

36- "/D183 - 

37- "/D278 -

SERUM ANTIGEN







C-100 


33C 


5.1.1 

f P 


409-1 

a) 


-i 


36 


w q n X •» *• / !«/ w 




T 
X 
















T 

X 












40- »/D91 


+ 


2 + 


+ 


2+ 








41- M /D289 


+ 


4+ 


+ 


3+ 


2 + 


- 


- 


*# * — / jj^ j j 






3 + 


4+ 


2 + 






4 j "n x^ £/uu 




T 












ti /nci 






1 


X 








4 D" / UD / 




2 + 


X 


+ 












2 + 




3 + 


I 






A "7 _ 11 /no/ 




3 + 


+ 


4 + 


+ . 






48- «'/D199 


+ 


4 + 


2 + 


4 + 


+ 




I 


** / uu 




1 










^™ 


DO** / UJL4 O 
















51™ /UX54 






* * 






— * 




52 — /Dl/Q 








— 








DJ — /IJZXO 
















j4* / D2 bo 












— 




3D " / U J J O 
































57-N16/D0 
















58- 'VD47 
















59- "/D62 
















60- n /D83 
















61- "/D137 
















61- "/D167 
















63- M /D197 
















64- "/D370 

















[0166] The screening sera GLI-1, FEC, BV, and SKF have been defined above. The numbered serum samples
correspond to human clinical serum samples which were PT-NANBH positive; these samples were obtained from
Dr. Francoise Fabiani-Lunel, Hospital La Pitie Salpetriere, Paris, France. As can be seen from the results presented in
Table 11, the antigens produced by clones 36 and 40, while not as sensitive as 409-1-1(c-a), do yield HCV-specific
immunopositive signals.

Example 14

Isolation of 409-1-1 Fusion Protein

[0167] Sepharose 4B beads conjugated with anti-beta galactosidase were purchased from Promega. The beads
were packed in a 2 ml column and washed successively with phosphate-buffered saline with 0.02% sodium azide and 10
ml TX buffer (10 mM Tris buffer, pH 7.4, 1% aprotinin).

[0168] BNN103 lysogens infected with gt11/409-1-1(c-a) from Example 12 were used to inoculate 500 ml of
NZYDT broth. The culture was incubated at 32°C with aeration to an O.D. of about 0.2 to 0.4, then brought to 43°C
quickly in a 43°C water bath for 15 minutes to induce gt11 peptide synthesis, and incubated further at 37°C for 1 hour.
The cells were pelleted by centrifugation and suspended in 10 ml of lysis buffer (10 mM Tris, pH 7.4 containing 2%
Triton X-100™ and 1% aprotinin added just before use). The resuspended cells were frozen in liquid nitrogen, then
thawed, resulting in substantially complete cell lysis. The lysate was treated with DNaseI to digest bacterial and
phage DNA, as evidenced by a gradual loss of viscosity in the lysate. Non-solubilized material was removed by
centrifugation.

[0169] The clarified lysate material was loaded on the Sepharose column, the ends of the column were closed, and
the column was placed on a rotary shaker for 2 hrs. at room temperature and 16 hours at 4°C. After the column settled,
it was washed with 10 ml of TX buffer. The fused protein was eluted with 0.1 M carbonate/bicarbonate buffer, pH 10. A
total of 14 ml of the elution buffer was passed through the column, and the fusion protein eluted in the first 4-6 ml of
eluate.

[0170] The first 6 ml of eluate from the affinity column were concentrated in Centricon™-30 cartridges (Amicon,
Danvers, Mass.). The final protein concentrate was resuspended in 400 µl PBS buffer. Protein purity was analyzed by
SDS-PAGE. A single prominent band was observed.

Example 15

Preparation of Anti-409-1-1(c-a) Antibody

[0171] The 409-1-1(c-a) digest fragments from lambda gt11 were released by EcoRI digestion of the phage, and
the "A" region purified by gel electrophoresis. The purified fragment was introduced into the pGEX expression vector
(Smith). Expression of glutathione S-transferase fused protein (Sj26 fused protein) containing the 409-1-1(a) peptide
antigen was achieved in E. coli strain KM392 (above). The fusion protein was obtained from lysed bacteria and isolated
by affinity chromatography on a column packed with glutathione-conjugated beads, according to published methods
(Smith).

[0172] The purified Sj26/409-1-1(a) fused protein was injected subcutaneously in Freund's adjuvant in a rabbit.
Approximately 1 mg of fused protein was injected at days 0 and 21, and rabbit serum was collected on days 42 and 56.

[0173] A purified Sj26/5-1-1 fused protein was similarly prepared using an amplified HCV fragment encoding
the 5-1-1 fragment. The fused Sj26/5-1-1 protein was used to immunize a second rabbit, following the same
immunization schedule. A third rabbit was similarly immunized with purified Sj26 protein obtained from control
bacterial lysate.

[0174] Minilysates from the following bacterial cultures were prepared as described in Example 12: (1) KM392 cells
infected with pGEX, pGEX containing the 5-1-1 insert, and pGEX containing the 409-1-1(a) insert; and (2) BNN103
infected with lambda gt11 containing the 5-1-1 insert and gt11 containing the 409-1-1(c-a) insert. The minilysates were
fractionated by SDS-PAGE, and the bands transferred to nitrocellulose filters for Western blotting as described in
Example 12. Table 12 shows the pattern of immunoreaction which was observed when the five lysate preparations
(containing the antigens shown at the left in the table) were screened with each of the three rabbit immune sera.
Summarizing the results, serum from the control (Sj26) rabbit was immunoreactive with each of the Sj26 and Sj26 fused
protein antigens. Serum from the animal immunized with Sj26/5-1-1 fused protein was reactive with all three Sj26
antigens and with the beta-gal/5-1-1 fusion protein, indicating the presence of specific immunoreaction with the 5-1-1
antigen. Serum from the animal immunized with Sj26/409-1-1(a) fused protein was reactive with all three Sj26 antigens
and with the beta-gal/409-1-1(c-a) fusion protein, indicating the presence of specific immunoreaction with the
409-1-1(a) antigen. None of the sera were immunoreactive with beta-galactosidase (obtained from a commercial source).



Table 12

                            Antibody
Antigens                    Sj26    5-1-1/Sj26    409-1-1(a)/Sj26
Sj26                        +       +             +
5-1-1 (Sj26)                +       +             +
5-1-1 (beta-gal)                    +
409-1-1(a) (Sj26)           +       +             +
409-1-1(c-a) (beta-gal)                           +



[0175] Anti-409-1-1(a) antibody present in the sera from the animal immunized with the Sj26/409-1-1(a) is purified
by affinity chromatography, following the general procedures described in Example 12, but where the ligand derivatized
to the Sepharose beads is the purified beta-gal/409-1-1(c-a) fusion protein, rather than the anti-beta-galactosidase
antibody.

Example 16 

Cloning the HCV Capsid Protein Coding Sequences

[0176] This example describes the cloning of the HCV coding sequence which encodes the N-terminal region of the
HCV capsid protein.

[0177] The protein sequence of the HCV-capsid associated antigen corresponds to the nucleotide residues
325-970 of the full length HCV sequence (see Appendix A). The following sequences were used as PCR primers to clone
this region: SF2(C), 5' end starting at nucleotide 325 of the full length HCV sequence (Appendix),
5'-GCGCCCATGGGCACGATTCCCAAACCTCA; and SR1(C), 3' end starting at nucleotide 969 of the full length HCV sequence
(Appendix), 5'-GCCGGATCCCTATTACTC(G/A)TACACAAT(A/G)CT(C/T)GAGTT(A/G)G. The anticipated size of the
fragment generated using the SF2(C)/SR1(C) primer pair was 644 base pairs.
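
The SR1(C) primer is written with four two-fold degenerate positions, so it stands for a pool of 16 oligonucleotides. The sketch below expands that notation and checks for the restriction sites used later for cloning. It is an illustration only: the `expand_degenerate` helper is hypothetical, and the primer strings assume that hyphens inside the printed sequences are typesetting breaks rather than part of the sequence.

```python
import itertools
import re

# Primer strings as read from the text, with internal hyphens removed (an assumption).
SF2_C = "GCGCCCATGGGCACGATTCCCAAACCTCA"
SR1_C = "GCCGGATCCCTATTACTC(G/A)TACACAAT(A/G)CT(C/T)GAGTT(A/G)G"

NCOI_SITE = "CCATGG"
BAMHI_SITE = "GGATCC"


def expand_degenerate(primer):
    """List every concrete sequence encoded by (X/Y)-style degenerate positions."""
    pieces = re.split(r"(\([ACGT](?:/[ACGT])+\))", primer)
    choices = [
        piece[1:-1].split("/") if piece.startswith("(") else [piece]
        for piece in pieces
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


variants = expand_degenerate(SR1_C)

# Distance between the stated primer start positions on the full length HCV sequence.
anticipated_bp = 969 - 325
```

Under these assumptions SF2(C) carries an NcoI site and every SR1(C) variant carries a BamHI site, consistent with the NcoI/BamHI double digestion described for cloning the amplification product.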

[0178] SISPA-amplified cDNA fragments from Example 7 were mixed with 100 µl Buffer A, 1 µM of equal molar
amounts of each SF2(C) and SR1(C) primer given above, 200 each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of
Thermus aquaticus DNA polymerase (Taq polymerase), as in Example 8.

[0179] Specific amplification of the SISPA-amplified cDNA fragments with the capsid primer pair given above was
carried out under conditions similar to those described in Example 7, with 1 minute at 72°C and about 30 cycles.

[0180] The amplified fragment mixtures from above were each fractionated by agarose gel electrophoresis on
duplicate 1.2% agarose gels, and one of the gels transferred to nitrocellulose filters (Southern) for hybridization
with a radioactively labelled oligonucleotide (Southern) having the following sequence: SF3(M/E), 5' end starting at
nucleotide 792 of the full length HCV sequence (Appendix), 5'-GCGCCCATGGTTCTGGAAGACGGCGTG. This
oligonucleotide corresponds to a sequence internal to the amplification product generated by using the SF2(C) and
SR1(C) primers. Eight out of 15 PCR products were identified which gave a positive hybridization signal with the
internal probe.

[0181] The vectors pGEX (Example 15) and pET (NOVAGEN, 565 Science Drive, Madison, WI 53711) were chosen
for bacterial expression of protein sequences encoded by the inserts. The pGEX vector provided expression of the
inserted coding sequences as fusion proteins to Sj26 (see Examples 12 and 15) and the pET vector provided expression
of the cloned sequences alone. To clone the capsid sequences, the amplification product bands were excised from
the duplicate gel. The DNA was extracted from the agarose and doubly digested with NcoI and BamHI. A pGEX vector
containing the BamHI/NcoI cloning sites was also doubly digested with BamHI and NcoI. The vector and extracted
DNA were then ligated under standard conditions and the ligation mixture transformed into bacterial cells.

[0182] The bacterial transformants were cultured under ampicillin selection, and the plasmid DNA isolated by
alkaline lysis (Maniatis et al.). The isolated plasmid DNA was digested with NcoI and BamHI. The digestion products
were then electrophoretically separated on an agarose gel. The gel was transferred to nitrocellulose and probed with
radioactively labelled SF3 as above. Twelve clones were confirmed to have the insert of interest by the Southern blot
analysis.

[0183] Clones were generated in the pET vector in essentially the same manner.

Example 17 

Immunological Screening of the Putative HCV Capsid Protein Clones 

[0184] This example describes the immunological screening of the putative HCV capsid protein clones which were
obtained in Example 16.

[0185] Of the twelve clones obtained in Example 16, protein mini-lysates of 7 clones (clones #8, 14, 15, 56, 60, 65,
and 66) were prepared as described in Example 12. These mini-lysates were fractionated as described and transferred
to nitrocellulose for Western Blot analysis. Table 13 shows the pattern of immunoreaction which was observed when
the 7 lysate preparations were screened with the indicated sera.



Table 13

           Sera
Clone      SKF    FEC    A6    B9    BV
8
14                +      +     +     +
15                +            +
56                +      +     +     +
60                +      +     +     +
65                +      +     +     +
Sj26
5-1-1             +      +     +
409-1-1                  +     +





[0186] The serum samples used for screening were identified as follows: SKF, HCV negative; FEC, HCV positive;
BV, community acquired HCV; A6 and B9 correspond to human clinical serum samples which were HCV positive.

[0187] Immunoreactive bands identified on the Western blot were all smaller than the expected size of 50 kd (based
on the predicted coding sequence of the cloned inserts, see below).

[0188] Clone 15 was chosen for scale-up production of the Sj26 fusion protein (Smith et al.). A one liter preparation
of clone 15 yielded about 200 ng of purified immunoreactive material. The bulk of the immunoreactive material
appeared in a major doublet band which ran at approximately 29 kd. The yield from this preparation was unexpectedly
low: typically with the pGEX system a one liter protein preparation yields in the range of 50-100 mg fusion protein.

Example 18 

Nucleic Acid Sequences of Clones 15 and 56 

[0189] The inserts of clones 15 and 56 (discussed in Example 17) were sequenced as per the manufacturer's 
instructions (US Biochemical Corporation, Cleveland OH) using the dideoxy chain termination technique (Sanger, 
1979). Each of the clones had an open reading frame contiguous with the Sj26 reading frame of the pGEX vector. The 
sequences of the clone inserts were near identical with only a few minor sequence variations: the sequence of clone 15 
had a termination codon starting at nucleotide position 126. The sequence data for clone 56 is presented as SEQ ID
NO:11 and in Figure 8A.

[0190] The sequencing of the inserts revealed the unusual feature of a run of adenine residues from nucleotide 
position 25 to position 34 (Figure 8A): such sequences are similar to sequences known to promote translation frame- 
shifting (Wilson et al., Atkins et al.). The open reading frame contiguous with the Sj26 coding sequence predicts a pro- 
tein of approximately 23.5 kd. Accordingly, given the approximately 26 kd size of the Sj26 protein fragment in this con- 
struct (Smith et al.), the complete fusion protein would be predicted to be approximately 50 kd. 
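
The size prediction above is a simple sum of the two approximate masses given in the text. A minimal sketch of that arithmetic follows; the function and variable names are illustrative and not part of the original disclosure.

```python
# Approximate masses in kilodaltons, as stated in the text above.
SJ26_FRAGMENT_KD = 26.0   # Sj26 protein fragment in the pGEX construct
CAPSID_ORF_KD = 23.5      # protein predicted from the insert open reading frame


def predicted_fusion_size_kd(carrier_kd: float, insert_kd: float) -> float:
    """Return the expected mass of an in-frame fusion of carrier and insert."""
    return carrier_kd + insert_kd


fusion_kd = predicted_fusion_size_kd(SJ26_FRAGMENT_KD, CAPSID_ORF_KD)
print(f"Predicted fusion protein: about {fusion_kd:.1f} kd")
```

The sum, 49.5 kd, is what the text rounds to approximately 50 kd; the immunoreactive bands described in Example 17 were all smaller than this.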

Example 19 

Hydropathicity Plot of the Protein Encoded by Clone 56

[0191] The SOAP program from the IntelliGenetics PC/GENE™ software package was used to generate the hydropathicity
plot of Figure 9. The SOAP program uses the method of Kyte et al. to plot the hydropathicity of the protein
along its sequence. The interval used for the computation was 11 amino acids. In Figure 9, the hydrophobic side of the
plot corresponds to the positive values range and the hydrophilic side to the negative values range.
[0192] The hydropathicity plot indicates (i) the hydrophilic nature of the amino terminus of the capsid protein, (ii) the
relatively hydrophobic nature of the region of amino acid residues approximately 122 to 162, and (iii) the hydrophobic
nature of amino acid residues approximately 168-182.

[0193] Further, the region of amino acid residues 168-182 demonstrates potential for being a membrane spanning
segment (Klein et al.).
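
The kind of calculation behind such a plot can be sketched as a sliding-window average over the Kyte and Doolittle hydropathy scale, using the 11-residue interval stated above. This is only an illustration: the internals of the SOAP program are not described in the text, the function name is hypothetical, and the input shown is a toy sequence rather than the clone 56 protein.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 11) -> list[tuple[int, float]]:
    """Return (center_position, mean_hydropathy) for every full window.

    Positions are 1-based and refer to the central residue of each window.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    seq = sequence.upper()
    half = window // 2
    profile = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in segment) / window
        profile.append((start + half + 1, mean))
    return profile


# Toy input (not the clone 56 protein): a lysine-rich stretch followed by a leucine-rich one.
toy_protein = "K" * 11 + "L" * 11
for position, value in hydropathy_profile(toy_protein):
    side = "hydrophobic" if value > 0 else "hydrophilic"
    print(f"{position:3d}  {value:6.2f}  {side}")
```

With this sign convention, a stretch such as residues 168-182 would show up as a run of positive window averages.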
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Example 20 

Deletion Analysis of the Clone 56 Protein Coding Region 

[0194] This example describes the generation of a series of carboxy and amino terminal deletions of the HCV capsid
protein and the effect of these deletions on the immunoreactivity of the resulting proteins.

A. Carboxy Terminal Deletions of Clone 56. 

[0195] As one step to improve the expression of the HCV capsid protein, the putative region of translational
frameshifting was modified to reduce the probability of a frameshift occurring in this region. In each AAA codon encoding
lysine (nucleotide positions 25 to 33, Figure 8A), the third nucleotide (positions 27, 30 and 33, Figure
8A) was changed from A to G using standard PCR mismatch techniques (Ausubel et al., Mullis, Mullis et al.). The sites
of these substitutions are indicated in Figure 8A by the three G's placed over the corresponding A's. The sequence of
the modified pGEX clone was confirmed as in Example 18 and the clone was named pGEX-CapA. The insert sequence
of clone pGEX-CapA is shown in Figure 8B and presented as SEQ ID NO:13.
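
The substitution described above replaces each AAA codon with AAG, which also encodes lysine, so the protein is unchanged while the run of adenines is interrupted. A minimal sketch of that edit is given below. The flanking sequence is a hypothetical placeholder, since Figure 8A is not reproduced here; only the positions (25 to 33, with changes at 27, 30 and 33) come from the text.

```python
def break_adenine_run(seq: str, first: int, last: int) -> str:
    """Change every AAA codon to AAG within 1-based positions first..last (inclusive).

    The region must start on a codon boundary and cover whole codons.
    """
    if (last - first + 1) % 3 != 0:
        raise ValueError("region must cover a whole number of codons")
    bases = list(seq.upper())
    for codon_start in range(first - 1, last, 3):
        codon = "".join(bases[codon_start:codon_start + 3])
        if codon == "AAA":
            bases[codon_start + 2] = "G"   # AAA (Lys) -> AAG (Lys)
    return "".join(bases)


# Hypothetical insert: 24 placeholder bases, ten adenines at positions 25-34, six more placeholders.
original = "C" * 24 + "A" * 10 + "C" * 6
modified = break_adenine_run(original, first=25, last=33)

changed = [i + 1 for i, (a, b) in enumerate(zip(original, modified)) if a != b]
print(modified[24:34])   # the former adenine run
print(changed)           # 1-based positions that were altered
```

In this toy case the altered positions are 27, 30 and 33, matching the positions named in the text.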

[0196] The deletion clones were generated using the PCR primers given in Table 14. In Table 14 the BamHI site is
italicized and the termination codon is underlined.

Table 14

CARBOXY TERMINAL DELETION PRIMERS

1.  C1      5'-CGA TCC ATG GGC ACG AAT CCT AAA CC
2.  NC580   5'-G GCC GGA TCC TTA GGC CGA AGC GGG CAC AG
3.  NC520   5'-G GCC GGA TCC TTA ACC AGG AAG GTT CCC TGT TGC
4.  NC450   5'-G GCC GGA TCC TTA GGC CCT GGC ACG GCC TCC
5.  NC360   5'-G GCC GGA TCC TTA CAA ATT GCG CGA CCT ACG CC
6.  NC270   5'-G GCC GGA TCC TTA GCC CTC ATT GCC ATA GAG



[0197] Amplification reactions were carried out essentially as described in Example 16 using primer C1 paired with
each of the NC primers and purified plasmid pGEX-CapA as template: each amplification cycle was 1 minute at 95°,
2 minutes of annealing at 50°, and 3 minutes at 72°, for 20 cycles.

[0198] The following sequence comparisons are given relative to the nucleic acid sequence presented in Figure 8B.
The C1 primer corresponds to the common 5' end of the pGEX-CapA insert, which contains an NcoI site near the initiating
methionine. The sequence of each NC primer starts at the nucleotide position indicated; for example, the
homologous sequence of the NC580 primer ends at nucleotide position 580. A termination codon is inserted at that
position, adjacent to a BamHI site. The positions of the primers given in Table 14 are indicated in Figure 8B. The approximate
locations of the primers relative to the protein sequence are indicated in Figure 9.
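
Because the NC primers are antisense, the termination codon appears in them as its reverse complement. The sketch below checks four of the NC primers of Table 14 for a BamHI site (GGATCC) and confirms that the triplet directly 3' of that site reads as a stop codon on the sense strand. The helper names are illustrative; the primer sequences are those legible in Table 14.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BAMHI_SITE = "GGATCC"
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Antisense (NC) primers as listed in Table 14, written 5' to 3'.
NC_PRIMERS = {
    "NC580": "G GCC GGA TCC TTA GGC CGA AGC GGG CAC AG",
    "NC520": "G GCC GGA TCC TTA ACC AGG AAG GTT CCC TGT TGC",
    "NC450": "G GCC GGA TCC TTA GGC CCT GGC ACG GCC TCC",
    "NC360": "G GCC GGA TCC TTA CAA ATT GCG CGA CCT ACG CC",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def describe_primer(primer: str) -> dict:
    """Locate the BamHI site and read the adjacent triplet on the sense strand."""
    seq = primer.replace(" ", "").upper()
    site = seq.find(BAMHI_SITE)
    if site == -1:
        return {"has_bamhi": False, "sense_codon": None, "is_stop": False}
    antisense_triplet = seq[site + 6:site + 9]
    sense_codon = reverse_complement(antisense_triplet)
    return {
        "has_bamhi": True,
        "sense_codon": sense_codon,
        "is_stop": sense_codon in STOP_CODONS,
    }


for name, primer in NC_PRIMERS.items():
    print(name, describe_primer(primer))
```

Each primer carries TTA after the BamHI site, which corresponds to TAA on the coding strand, so translation of the amplified fragment stops at the intended deletion endpoint.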

[0199] The resulting amplification products were electrophoretically size fractionated on a polyacrylamide gel and
the DNA products of the appropriate sizes electroeluted from the gel. The amplification products were cloned into both
the pGEX and the pET vectors for expression. The sequences of the inserts were confirmed as described in Example
18.

[0200] The pGEX vectors containing the carboxy-terminal deletions were transformed into E. coli and the fusion
proteins purified essentially as follows. Expression of the fusion protein was induced with IPTG for 3-4 hours. The cells
were then harvested at 6,000 rpm for 10 minutes. The E. coli were then lysed in MTPBS buffer (150 mM NaCl; 16 mM
Na2HPO4; 4 mM NaH2PO4, pH=£.0) after which 1% "TRITON X-100", 3 ng/ml DNase I, and 1 mM PMSF were added.
The lysates were centrifuged at 15,000 rpm for 20 minutes. The supernatants were discarded and the pellets resuspended
in 8M urea. The components of the resuspension were separated by HPLC using a "BIO-GEL SP-5-PW" column.
Typically, the fusion protein was the predominant peak: the location of the fusion protein was confirmed by
Western Blot analysis. Clones C1NC270, C1NC360, and C1NC450 all expressed Sj26 fusion proteins at high levels:
the fusion proteins all corresponded to the size predicted from the insert coding sequence fused to the Sj26 protein and
were immunoreactive with HCV-positive sera (Western Blots were performed as described in Example 17). Although
the supernatants were discarded, substantial amounts of the fusion proteins were also present in the supernatants.
Clones C1NC520 and C1NC580 gave poor yields of fusion proteins.
[0201] An epitope map of the HCV capsid region is presented in Figure 10: the locations of the immunoreactive protein
coding sequences corresponding to inserts C1NC450, C1NC360, and C1NC270 are indicated. The sequences of
C1NC450, C1NC360, and C1NC270 are presented in the Sequence Listing as SEQ ID NO:15, SEQ ID NO:17, and
SEQ ID NO:19, respectively.

B. Amino Terminal Deletions of Clone 56. 

[0202] Amino terminal deletion clones were generated using the PCR primers given in Table 15. 



Table 15

AMINO TERMINAL DELETION PRIMERS

1.  C100   GAG CCC ATG GGT GGA GTT TAG TTG TTG CC
2.  C270   GAG CCC ATG GGC TGC GGG TGG GCG GG
3.  C360   GAG CCC ATG GGT AAG GTC ATC GAT ACC



[0203] Amplification reactions were carried out essentially as described above using the primer pairs presented in
Table 16 and purified plasmid pGEX-CapA as template: each amplification cycle was 1 minute at 95°,
2 minutes of annealing at 50°, and 3 minutes at 72°, for 20 cycles.



Table 16

NH2 Primer    COOH Primer    Protein Produced?    Immunoreactive?
C100          NC450          LOW                  YES
C100          NC360          YES                  YES
C100          NC270          YES                  YES
C270          NC450          YES                  NO
C270          NC360          YES                  NO
C360          NC450          YES                  NO




[0204] The following sequence comparisons are given relative to the nucleic acid sequence presented in Figure 8B,
where the above described A to G substitutions have been made for the sequence of pGEX-CapA. The NC660 primer
corresponds to the common 3' end of the pGEX-CapA insert, which contains a BamHI site near the end of the insert.
The sequence of each C primer starts at the nucleotide position indicated; for example, the sequence of the C100
primer begins at nucleotide position 100. Each of the C primers introduces an in-frame initiation codon in the resulting
amplification product. The positions of the primers given in Table 15 are indicated in Figure 8B.
[0205] The resulting amplification products were cloned into the pGEX and pET vector for expression as described 
above. The sequences of the inserts were confirmed. 

[0206] The pGEX vectors containing the amino-terminal deletions were transformed into E. coli, protein mini-lysates
prepared, and the immunoreactivity of the proteins analyzed by Western Blots as described above. The results
of the analysis are presented in Table 16. Clones C100NC270 and C100NC360 expressed Sj26 fusion proteins at high
levels: the fusion proteins corresponded to the size predicted from the insert coding sequence fused to the Sj26 protein.
[0207] An epitope map of the HCV capsid region is presented in Figure 10: the location of the protein coding 
sequences corresponding to inserts C100NC270, C100NC360, C270NC360, and C270NC450 are indicated. The 
sequences for C100NC270 and C100NC360 are presented in the Sequence Listing as SEQ ID NO:21 and SEQ ID 
NO:23, respectively. 
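
The deletion results reported in this example can be combined to narrow down where immunoreactivity resides. The sketch below is one simple way to do that, under two labeled assumptions: fragment coordinates are taken directly from the primer names (with C1 treated as nucleotide position 1), and a single shared region is assumed to account for reactivity, which the text does not state. Names and data layout are illustrative.

```python
# (start, end, immunoreactive) by nucleotide position, read from the primer names.
# Reactivity is taken from the carboxy-terminal deletion results and from Table 16.
FRAGMENTS = {
    "C1NC450": (1, 450, True),
    "C1NC360": (1, 360, True),
    "C1NC270": (1, 270, True),
    "C100NC450": (100, 450, True),
    "C100NC360": (100, 360, True),
    "C100NC270": (100, 270, True),
    "C270NC450": (270, 450, False),
    "C270NC360": (270, 360, False),
    "C360NC450": (360, 450, False),
}


def shared_reactive_region(fragments: dict):
    """Return the (start, end) interval common to all immunoreactive fragments, or None."""
    reactive = [(s, e) for s, e, is_reactive in fragments.values() if is_reactive]
    if not reactive:
        return None
    start = max(s for s, _ in reactive)
    end = min(e for _, e in reactive)
    return (start, end) if start <= end else None


def conflicting_fragments(fragments: dict, region: tuple) -> list:
    """List non-reactive fragments that nevertheless span the whole shared region."""
    region_start, region_end = region
    return [
        name
        for name, (s, e, is_reactive) in fragments.items()
        if not is_reactive and s <= region_start and e >= region_end
    ]


region = shared_reactive_region(FRAGMENTS)
print("Shared region of reactive fragments (nt):", region)
print("Conflicting non-reactive fragments:", conflicting_fragments(FRAGMENTS, region))
```

Under these assumptions every immunoreactive construct includes nucleotides 100 to 270, and none of the non-reactive constructs does, which is consistent with the pattern of Table 16.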

Example 21 

Expanded Immunoscreening Using the Capsid Antigen

[0208] This example describes three different comparisons of the immunoreactivity of the various HCV antigens of
the present invention to several batteries of sera.
A. Effectiveness of Cap450 Antigen. 

[0209] Table 17 shows the results for 50 human sera samples from patients suspected of NANB hepatitis infection.
The ELISA assays were performed essentially as described by Tijssen using the following antigens: C100, 409-1-1(c-a),
C33u, Cap450 (the protein product of the pGEX-C1NC450 clone), and 409-1-1(c-a) and Cap450 combined in one well,
which was optimized to give the most sensitive results. These ELISA data were compared with the Abbott C100 test.
[0210] Patient serum was scored positive for Sj26 fusion proteins (409-1-1(c-a), C33u, 5-1-1, and Cap450) if the
absorbance was three times the absorbance of that serum on Sj26 native protein. A sample was scored positive on pET
antigens (Cap360) if the absorbance was three times the mean of the absorbance of negative control sera. A patient
serum was scored positive on the combined 409-1-1(c-a)/Cap450 assay if the absorbance was equivalent or greater than
that of control positive sera. Samples within 10% of the control positive sera were scored weak positives.
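
The three scoring rules above can be written out as a short sketch. Two readings are assumptions made here and not stated in the text: "three times" is treated as "at least three times", and "within 10%" is treated as up to 10% below the positive control. Function names and the example absorbance values are hypothetical.

```python
from statistics import mean


def score_sj26_fusion(absorbance: float, absorbance_on_sj26: float) -> bool:
    """Sj26 fusion antigens: positive if at least 3x the same serum's signal on native Sj26."""
    return absorbance >= 3 * absorbance_on_sj26


def score_pet_antigen(absorbance: float, negative_control_absorbances: list) -> bool:
    """pET antigens: positive if at least 3x the mean absorbance of negative control sera."""
    return absorbance >= 3 * mean(negative_control_absorbances)


def score_combined(absorbance: float, positive_control_absorbance: float) -> str:
    """Combined well: compare against the positive control serum."""
    if absorbance >= positive_control_absorbance:
        return "positive"
    if absorbance >= 0.9 * positive_control_absorbance:
        return "weak positive"
    return "negative"


# Hypothetical absorbance readings, for illustration only.
print(score_sj26_fusion(0.90, 0.20))
print(score_pet_antigen(0.40, [0.10, 0.12, 0.08]))
print(score_combined(0.95, 1.00))
```

Keeping the three rules as separate functions mirrors the text, where the reference signal differs for each antigen type.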

Samples 1-19: Chronic active hepatitis proven by biopsy. HBsAg(-).

Samples 20-44: Acute viral hepatitis. HBsAg(-), IgM anti-HBc(-), IgM anti-HAV(-).
Samples 45-50: Chronic active hepatitis proven by biopsy. HBsAg(-).



Table 17

Korean Panel II 








Sample 


C100


409-1-1 
(c-a) 


C33u 


Cap 
450 


Combined
409-1-1 (c-a)
+Cap450


1 






+ 




4- 


+ 


2 






+ 




+ " 




3 




+ 


+ 








4 






+ 








5 














6 






+ 






+ 


7 






+ 






♦ 


8 












• 4- 
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9 




+ 






+ 


-4- 


5 


10 










* 






IX 




+ 




+ 


* 


+ 




12 








+ 








13 




+ 




+ 






10 


14 






























15 




♦ 


+ 










16 








+ 


* 


+ 


15 


17 
















18 




+ 












19 




+ 












20 


945 












20 


21 


988 








* 






22 


3383 














23 


4072 












25 I 


24 


4242 














25 


4490 













* = positive (low)



Sample 
# 


C100 


409-1-1 
(c-a) 


C33u 


Cap 
450 


Combined 
409-1-1 (c-a) 
+CAP450 


26 


4816 


- 


- 


- 


- 


- 


27 


5322 


- 


• 


- 


- 


- 


28 


6603 


- 


- 


- 


- 


- 


29 


7923 


- 


- 


- 


- 


- 


30 


9033 


- 


- 


- 


- 


- 


31 


9768 


- 


- 


- 


- 


- 


32 


9775 


- 


- 


- 


- 


- 


33 


10197 




- 


- 






34 


10200 


- 


- 


- 


- 


- 


35 


10409 


- 


- 


- 


- 


- 


36 


10811 


- 


- 


- 


- 


- 


37 


11209 


- 








. NO 


38 


12245 


- 


- 


- 


- 


- 


39 


12143 


- 


- 


- 


- 




40 


12519 




- 


- 


- 


- 


41 


13510 


- 


- 


- 


- 


- 


42 


14018 


- 


- 


- 


- 


- 


43 


14188 


- 


- 


- 




- 


44 


13437 


- 


- 


- 


- 


- 


45 


863 












46 


3354 ' 












47 


12640 












48 


13095 




• 


4- 






49 


14501 












50 


14345 






+ 


+ 





* = positive (low)



[0211] The results demonstrate that the Cap450 protein has good sensitivity for detecting the presence of anti-HCV
antibodies in sera samples. Three additional samples (6, 37, and 47) were detected. Further, these results indicate that
the combination of Cap450 and 409-1-1(c-a) can be used to produce a kit which is very effective for detection of anti-HCV
antibodies in human sera samples.


B. Cap450 and Cap360. 



[0212] The results in Table 18 demonstrate the effectiveness of the Cap450 and Cap360 antigens (Cap360 being the protein
product encoded by pET-C1NC360) in detecting HCV antibodies present in human sera. The samples were tested for the
presence of HCV antibodies by ELISA using each individual antigen shown, or with 409-1-1(c-a) and Cap450 antigens combined
in one well.
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Table 18 



5 


SERUM 


PATIENT DIAGNOSIS 


C100 


ELISA 
5-1-1 


409-1-1 
(c-a) 


C33u 


Cap 
360 


Combined 
(409-1-1) 
+Cap450 


10 


G-131 


Acute Hepatitis? Pt "CO, " 


- 


- 


- 


- 


- 


- 




G-132 


Acute Hepatitis; Pt "CO. * 
















G-143 


Acute Hepatitis; Pt "CO." 


- 


- 


- 


- 


- 


- 




G-285 


Acute Hepatitis; Pt "CO** 


ND 


ND 


ND 


ND 


ND 


— 


15 


G-150 


Acute P.T. Hepatitis; Pt
"C.L." 


- 


- 


I 


I 


+ 


+ 




G-151 


Acute P.T. Hepatitis; Pt 
-G.L. - 


- 


- 


I 


- 


+ 




20 


G-152 


Acute P.T. Hepatitis; Pt 


- 


- 


- 


- 


+ 






G-153


Acute P.T. Hepatitis; Pt 
"G.L. w 


- 


- 


I 


- 






25 


G-286 


Acute P.T. Hepatitis; Pt 
"G.L. " 


ND 


ND 


ND 


ND 


ND 






G-43 


Fulminant Liver Disease 


— 


— 


— 


— 


— 


— 


30 


G-l 




Community Acquired 
Hepatitis


ND 


I 












G-109 


Community Acquired 
Hepatitis 


+ 












35 


G-114 


Community Acquired 
Hepatitis 


ND 














G-128 


Community Acquired 
Hepatitis 


+ 




I 


+ 




+ 


40 


G-3 


Community Acquired 
Hepatitis 















45 



50 
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5 


SERUM


PATIENT DIAGNOSIS


C100


ELISA
5-1-1


409-1-1
(c-a)


C33u


Cap
360


Combined
(409-1-1)
+Cap450




G-126 


Community Acquired 
Hepatitis 














10 


G-127 


Community Acquired 
Hepatitis 




I 










G-42 


Idiopath. Comm. Ac. 
Hepatitis 
















G-51 


Community Acquired 
Hepatitis B 














15 


G-27 


Community Acquired 
Hepatitis B 
















G-22' 


Community Acquired 
Hepatitis B 




— 


— 






z — 


20 


G-40 


Community Acquired 
Hepatitis B


- 


- 


- 


- 


- 






G-31 


Community Acquired 
Hepatitis B 














25 


G-45 


Community Acquired 
Hepatitis B 


- 


- 


- 


- 


• - 


- 




G-38 


Fulminant Hepatitis B 


- 


- 


- 


- 


- 


- 


30 


G-41 


Community Acquired 
Hepatitis C 






I 




I 






G-13 


Hepatitis C


+ 


I 


+ 




♦ 






G-12 


Hepatitis C 






-4- 




♦ 




35 


G-6 


Hepatitis C 
















G-49 


EtOH Cirrhosis 
















G-25 


EtOH Cirrhosis 














An 


G-110 


EtOH Cirrhosis 










I 






G-46 


EtOH Cirrhosis 
















G-272 


Infant Liver Transplant 


ND 












45 


G-274 


Infant Liver Transplant 


ND 


• 


♦ 




- 


- 


G-16 


PBC 
















G-123 


INC LT 
















G-122 


INC LT 




+ 










SO 


G-125 


No Diagnosis 




I 






I 






G-124 


No Diagnosis















55 

[0213] These results indicate that the combination of antigen 409-1-1(c-a) and Cap360 or Cap450 results in an effective
diagnostic tool for detection of HCV infection. Five additional samples (G150, G151, G110, G125, and G124) were
detected with these ELISAs compared with the C100 test alone.
C. pET360 

[0214] The results in Table 19 demonstrate the effectiveness of the pET360 antigen in detecting HCV antibodies present in
human sera. The samples were tested for the presence of HCV antibodies by ELISA using each individual antigen shown, or with
409-1-1(c-a) and pET360 antigens combined in one well.



Table 19 





C100 


5-1-1 


409-1-1 (c-a) 


C33u 


pET360 


Combined 409-1-1 (c-a) 
+ pET360 


A 


+ 




+ 


+ 




+ 


B 


+ 


+ 


+ 


+ 




+ 




+ 






+ 




+ 


D 


+ 


+ 


+ 




+ 


+ 


E 


+ 




w+ 


+ 


+ 


+ 


F 






- 


- 




- 


G 


+ 




w+ 


+ 


+ 


+ 


H 
I 














J 














K 








+ 


+ 


+ 


L 














M 














N 




w+ 




+ 


+ 


+ 


O 


+ 


w+ 


+ 


+ 


+ 


+ 


P 


+ 


w+ 


+ 


+ 


+ 


+ 


Q 














R 














S 















[0215] These results indicate that the combination of antigen 409-1-1(c-a) and pET360 results in an effective diagnostic
tool for detection of HCV infection. Three additional samples were detected with these ELISAs compared with the C100
test alone.

[0216] Although the invention has been described with reference to particular embodiments, methods, construction
and use, it will be apparent to those skilled in the art that various changes and modifications can be made without
departing from the invention.
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Genelabs Technologies, Inc.
(ii) TITLE OF INVENTION: Hepatitis C Virus Epitopes 
(iii) NUMBER OF SEQUENCES: 26 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: R G C JENKINS & CO. 

(B) STREET: 26 Caxton Street 
(C) CITY: London

(D) STATE: 

(E) COUNTRY: United Kingdom 

(F) ZIP: SW1K ORJ 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: Windows 98 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 99204321.6 

(B) FILING DATE: 05-Apr-1991 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 91908451.7 

(B) FILING DATE: 05-Apr-1991 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US91/02370 

(B) FILING DATE: 05-Apr-1991 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 07/505,611 

(B) FILING DATE: 06-Apr-1990 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/594,854 

(B) FILING DATE: 09-Oct-1990 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Alan H West 

(B) REGISTRATION NUMBER: 37490 

(C) REFERENCE/DOCKET NUMBER: AHW/J. 20988 EPA

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 0171 931 7141 

(B) TELEFAX: 0171 222 4660 



(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 561 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus

(B) STRAIN: CDC

(vii) IMMEDIATE SOURCE:

(B) CLONE: 304-12-1

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..561

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GAA TTC CTC GTG CAA GCG TGG AAG TCC AAG AAA ACC CCA ATG GGG TTC
48
Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe
1 5 10 15

TCG TAT GAT ACC CGC TGC TTT GAC TCC ACA GTC ACT GAG AGC GAC ATC
96
Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile
20 25 30

CGT ACG GAG GAG GCA ATC TAC CAA TGT TGT GAC CTC GAC CCC CAA GCC
144
Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala
35 40 45

CGC GTG GCC ATC AAG TCC CTC ACC GAG AGG CTT TAT GTT GGG GGC CCT
192
Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro
50 55 60

CTT ACC AAT TCA AGG GGG GAG AAC TGC GGC TAT CGC AGG TGC CGC GCG
240
Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala
65 70 75 80

AGC GGC GTA CTG ACA ACT AGC TGT GGT AAC ACC CTC ACT TGC TAC ATC
288
Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile
85 90 95

AAG GCC CGG GCA GCC TGT CGA GCC GCA GGG CTC CAG GAC TGC ACC ATG
336
Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met
100 105 110

CTC GTG TGT GGC GAC GAC TTA GTC GTT ATC TGT GAA AGC GCG GGG GTC
384
Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val
115 120 125

CAG GAG GAC GCG GCG AGC CTG AGA GCC TTC ACG GAG GCT ATG ACC AGG
432
Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg
130 135 140

TAC TCC GCC CCC CCC GGG GAC CCC CCA CAA CCA GAA TAC GAC TTG GAG
480
Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu
145 150 155 160

CTC ATA ACA TCA TGC TCC TCC AAC GTG TCA GTC GCC CAC GAC GGC GCT
528
Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala
165 170 175

GGA AAG AGG GTC TAC TAC CTC ACC CGG GAA TTC
561
Gly Lys Arg Val Tyr Tyr Leu Thr Arg Glu Phe
180 185

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 187 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe
1 5 10 15

Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile
20 25 30

Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala
35 40 45

Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro
50 55 60

Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala
65 70 75 80

Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile
85 90 95

Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met
100 105 110

Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val
115 120 125

Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg
130 135 140

Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu
145 150 155 160

Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala
165 170 175

Gly Lys Arg Val Tyr Tyr Leu Thr Arg Glu Phe
180 185

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 252 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus

(B) STRAIN: CDC

(vii) IMMEDIATE SOURCE:

(B) CLONE: 303-1-4

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..252

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

AAC TCC GTG TGG AAA GAC CTT CTG GAA GAC AAT GTA ACA CCA ATA GAC
48
Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr Pro Ile Asp
1 5 10 15

ACT ACC ATC ATG GCT AAG AAC GAG GTT TTC TGC GTT CAG CCT GAG AAG
96
Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys
20 25 30

GGG GGT CGT AAG CCA GCT CGT CTC ATC GTG TTC CCC GAT CTG GGC GTG
144
Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val
35 40 45

CGC GTG TGC GAA AAG ATG GCT TTG TAC GAC GTG GTT ACC AAG CTC CCC
192
Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr Lys Leu Pro
50 55 60

TTG GCC GTG ATG GGA AGC TCC TAC GGA TTC CAA TAC TCA CCA GGA CAG
240
Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln
65 70 75 80

CGG GTT GAA TTC
252
Arg Val Glu Phe
(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr Pro Ile Asp
1 5 10 15

Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys
20 25 30

Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val
35 40 45

Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr Lys Leu Pro
50 55 60

Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln
65 70 75 80

Arg Val Glu Phe



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1512 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Hepatitis C Virus 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 303-1-4 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1512 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GAA TTC TTC ACA GAA TTG GAC GGG GTG CGC CTA CAT AGG TTT GCG CCC
48
Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro
1 5 10 15

CCC TGC AAG CCC TTG CTG CGG GAG GAG GTA TCA TTC AGA GTA GGA CTC
96
Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu
20 25 30

CAC GAA TAC CCG GTA GGG TCG CAA TTA CCT TGC GAG CCC GAA CCG GAT
144
His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp
35 40 45

GTG GCC GTG TTG ACG TCC ATG CTC ACT GAT CCC TCC CAT ATA ACA GCA
192
Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala
50 55 60

GAG GCG GCC GGG CGA AGG TTG GCG AGG GGA TCA CCC CCC TCT GTG GCC
240
Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala
65 70 75 80

AGC TCC TCG GCT AGC CAG CTA TCC GCT CCA TCT CTC AAG GCA ACT TGC
288
Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys
85 90 95

ACC GCT AAC CAT GAC TCC CCT GAT GCT GAG CTC ATA GAG GCC AAC CTC
336
Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu
100 105 110

CTA TGG AGG CAG GAG ATG GGC GGC AAC ATC ACC AGG GTT GAG TCA GAA
384
Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu
115 120 125
AAC AAA GTG GTG ATT CTG GAC TCC TTC GAT CCG CTT GTG GCG GAG GAG
432
Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu
130 135 140

GAC GAG CGG GAG ATC TCC GTA CCC GCA GAA ATC CTG CGG AAG TCT CGG
480
Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg
145 150 155 160

AGA TTC GCC CAG GCC CTG CCC GTT TGG GCG CGG CCG GAC TAT AAC CCC
528
Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro
165 170 175

CCG CTA GTG GAG ACG TGG AAA AAG CCC GAC TAC GAA CCA CCT GTG GTC
576
Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val
180 185 190

CAT GGC TGT CCG CTT CCA CCT CCA AAG TCC CCT CCT GTG CCT CCG CCT
624
His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro
195 200 205

CGG AAG AAG CGG ACG GTG GTC CTC ACT GAA TCA ACC CTA TCT ACT GCC
672
Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala
210 215 220

TTG GCC GAG CTC GCC ACC AGA AGC TTT GGC AGC TCC TCA ACT TCC GGC
720
Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly
225 230 235 240

ATT ACG GGC GAC AAT ACG ACA ACA TCC TCT GAG CCC GCC CCT TCT GGC
768
Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly
245 250 255

TGC CCC CCC GAC TCC GAC GCT GAG TCC TAT TCC TCC ATG CCC CCC CTG
816
Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu
260 265 270

GAG GGG GAG CCT GGG GAT CCG GAT CTT AGC GAC GGG TCA TGG TCA ACG
864
Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr
275 280 285

GTC AGT AGT GAG GCC AAC GCG GAG GAT GTC GTG TGC TGC TCA ATG TCT
912
Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser
290 295 300

TAC TCT TGG ACA GGC GCA CTC GTC ACC CCG TGC GCC GCG GAA GAA CAG
960
Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln
305 310 315 320

AAA CTG CCC ATC AAT GCA CTA AGC AAC TCG TTG CTA CGT CAC CAC AAT
1008
Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn
325 330 335

TTG GTG TAT TCC ACC ACC TCA CGC AGT GCT TGC CAA AGG CAG AAG AAA
1056
Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys
340 345 350

GTC ACA TTT GAC AGA CTG CAA GTT CTG GAC AGC CAT TAC CAG GAC GTA
1104
Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val
355 360 365

CTC AAG GAG GTT AAA GCA GCG GCG TCA AAA GTG AAG GCT AAC TTG CTA
1152
Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu
370 375 380

TCC GTA GAG GAA GCT TGC AGC CTG ACG CCC CCA CAC TCA GCC AAA TCC
1200
Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser
385 390 395 400

AAG TTT GGT TAT GGG GCA AAA GAC GTC CGT TGC CAT GCC AGA AAG GCC
1248
Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala
405 410 415

GTA ACC CAC ATC AAC TCC GTG TGG AAA GAC CTT CTG GAA GAC AAT GTA
1296
Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val
420 425 430

ACA CCA ATA GAC ACT ACC ATC ATG GCT AAG AAC GAG GTT TTC TGC GTT
1344
Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val
435 440 445

CAG CCT GAG AAG GGG GGT CGT AAG CCA GCT CGT CTC ATC GTG TTC CCC
1392
Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro
450 455 460

GAT CTG GGC GTG CGC GTG TGC GAA AAG ATG GCT TTG TAC GAC GTG GTT
1440
Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val
465 470 475 480

ACC AAG CTC CCC TTG GCC GTG ATG GGA AGC TCC TAC GGA TTC GAA TAC
1488
Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr
485 490 495

TCA CCA GGA CAG CGG GTT GAA TTC
1512
Ser Pro Gly Gln Arg Val Glu Phe
500

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 504 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro 
1 5 10 15 

10 Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu 

20 25 30 



75 



20 



25 



30 



35 



40 



45 



50 



His Glu Tyr Pro Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Pro Asp 
35 40 45 

Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala
50 55 60

Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala
65 70 75 80

Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys
85 90 95

Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu
100 105 110

Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu
115 120 125

Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu
130 135 140

Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg
145 150 155 160

Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro
165 170 175 

Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val 
180 185 190 

His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro 
195 200 205 

Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala 
210 215 220

Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly
225 230 235 240

Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly
245 250 255 

Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu 
260 265 270 

Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr 
275 280 285 



Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser 
290 295 300 

Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln
305 310 315 320

Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn
325 330 335

Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys
340 345 350

Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val
355 360 365 

Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu 
370 375 380 

Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser 
385 390 395 400 

Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala 
405 410 415 

Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val
420 425 430

Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val
435 440 445

Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro
450 455 460

Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val
465 470 475 480

Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr
485 490 495

Ser Pro Gly Gln Arg Val Glu Phe
500



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 477 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus 

(B) STRAIN: CDC 

(C) INDIVIDUAL ISOLATE: Rodney 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 409-1-1 (c-a) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..477 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GAA TTC CGC ACG CCC GCC GAG ACT ACA GTT AGG CTA CGG GCG TAC ATG 
48 

Glu Phe Arg Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met 
1 5 10 15 



AAC ACT CCG GGG CTT CCC GTG TGC CAG GAC GGA ATT CCG TCC CCG TCC 
96 

Asn Thr Pro Gly Leu Pro Val Cys Gln Asp Gly Ile Pro Ser Pro Ser
20 25 30 



ACC ACC GGA GAG ATC CCT TTT TAC GGC AAG GCT ATC CCC CTC GAA GTA 
144 

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
35 40 45

ATC AAG GGG GGG AGA CAT CTC ATC TTC TGT CAT TCA AAG AAG AAG TGC
192

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
50 55 60 



GAC GAA CTC GCC GCA AAG CTG GTC GCA TTG GGC ATC AAT GCC GTG GCC 
240 

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
65 70 75 80 



TAC TAC CGC GGT CTT GAC GTG TCC GTC ATC CCG ACC AGC GGC GAT GTT 
288 

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
85 90 95



GTC GTC GTG GCA ACC GAT GCC CTC ATG ACC GGC TAT ACC GGC GAC TTC 
336 

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe 
100 105 110 



GAC TCG GTG ATA GAC TGC AAT ACG TGT GTC ACC CAG ACA GTC GAT TTC 
384 

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
115 120 125 



AGC CTT GAC CCT ACC TTC ACC ATT GAG ACA ATC ACG CTC CCC CAG GAT 
432 

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
130 135 140 



GCT GTC TCC CGC ACT CAA CGT CGG GGC AGG ACT GGC ACG GAA TTC 
477 

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Thr Glu Phe
145 150 155 



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 159 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Glu Phe Arg Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met
1 5 10 15

Asn Thr Pro Gly Leu Pro Val Cys Gln Asp Gly Ile Pro Ser Pro Ser
20 25 30

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
35 40 45

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
50 55 60

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
65 70 75 80

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
85 90 95

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
100 105 110

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
115 120 125

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
130 135 140

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Thr Glu Phe
145 150 155 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 558 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Hepatitis C Virus 

(B) STRAIN: CDC 

(vii) IMMEDIATE SOURCE:

(B) CLONE: 409-1-1 (abc)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..558



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

TCC ACC ACC GGA GAG ATC CCT TTT TAC GGC AAG GCT ATC CCC CTC GAA 
48 

Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu
1 5 10 15



GTA ATC AAG GGG GGG AGA CAT CTC ATC TTC TGT CAT TCA AAG AAG AAG 
96 

Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys
20 25 30 



TGC GAC GAA CTC GCC GCA AAG CTG GTC GCA TTG GGC ATC AAT GCC GTG
144

Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val
35 40 45 



GCC TAC TAC CGC GGT CTT GAC GTG TCC GTC ATC CCG ACC AGC GGC GAT 
192 

Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp
50 55 60 



GTT GTC GTC GTG GCA ACC GAT GCC CTC ATG ACC GGC TAT ACC GGC GAC 
240 

Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp 
65 70 75 80 



TTC GAC TCG GTG ATA GAC TGC AAT ACG TGT GTC ACC CAG ACA GTC GAT
288

Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp
85 90 95 



TTC AGC CTT GAC CCT ACC TTC ACC ATT GAG ACA ATC ACG CTC CCC CAG 
336 

Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln
100 105 110

GAT GCT GTC TCC CGC ACT CAA CGT CGG GGC AGG ACT GGC AGG GGG AAG
384

Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys
115 120 125 

CCA GGC ATC TAC AGA TTT GTG GCA CCG GGG GAG CGC CCC TCC GGC ATG 
432 

Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met
130 135 140 

TTC GAC TCG TCC GTC CTC TGT GAG TGC TAT GAC GCA GGC TGT GCT TGG
480

Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp
145 150 155 160



TAT GAG CTC ACG CCC GCC GAG ACT ACA GTT AGG CTA CGA GCG TAC ATG 
528 

Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met 
165 170 175 



AAC ACC CCG GGG CTT CCC GTG TGC CAG GAC 
558 

Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
180 185 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 186 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu
1 5 10 15 



Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys
20 25 30

Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val
35 40 45

Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp
50 55 60

Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp
65 70 75 80

Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp
85 90 95

Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln
100 105 110

Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys
115 120 125

Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met
130 135 140 



Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp 
145 150 155 160 

Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met 
165 170 175 



Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
180 185 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 657 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus 

(B) STRAIN: CDC 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: GG1 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..645

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ATG GGC ACG AAT CCT AAA CCT CAA AAA AAA AAC AAA CGT AAC ACC AAC
48

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT 
96 

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30 

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGA TTG GGT GTG CGC GCG 
144 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

ACG AGA AAG ACT TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT 
192 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAG GCT CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG CCC GGG
240

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80 

TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC GGG TGG GCG GGA TGG 
288 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

CTC CTG TCT CCC CGT GGC TCT CGG CCT AGC TGG GGC CCC ACA GAC CCC
336 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
100 105 110 

CGG CGT AGG TCG CGC AAT TTG GGT AAG GTC ATC GAT ACC CTT ACG TGC 
384 

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGC TTC GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT CTT
432

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140 

GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG GTT CTG GAA GAC 
480 

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 



GGC GTG AAC TAT GCA ACA GGG AAC CTT CCT GGT TGC TCT TTC TCT ATC 
528 

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175 

TTC CTT CTG GCC CTG CTC TCT TGC TTG ACT GTG CCC GCT TCG GCC TAC 
576 

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr 
180 185 190 

CAA GTG CGC AAC TCC ACG GGG CTT TAC CAC GTC ACC AAT GAT TGC CCT 
624 

Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp Cys Pro
195 200 205

AAC TCG AGC ATT GTG TAC GAG TAATAGGGAT CC
657

Asn Ser Ser Ile Val Tyr Glu
210 215 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 215 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
100 105 110 

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr
180 185 190

Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu
210 215 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 657 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus
(B) STRAIN: CDC

(vii) IMMEDIATE SOURCE: 
(B) CLONE: CapA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..645 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATG GGC ACG AAT CCT AAA CCT CAG AAG AAG AAC AAA CGT AAC ACC AAC 
48 

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT
96

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30 

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGA TTG GGT GTG CGC GCG 
144 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 



ACG AGA AAG ACT TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT 
192 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAG GCT CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG CCC GGG
240

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80 

TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC GGG TGG GCG GGA TGG 
288 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95

CTC CTG TCT CCC CGT GGC TCT CGG CCT AGC TGG GGC CCC ACA GAC CCC
336

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

CGG CGT AGG TCG CGC AAT TTG GGT AAG GTC ATC GAT ACC CTT ACG TGC 
384 

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGC TTC GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT CTT
432

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140 

GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG GTT CTG GAA GAC 
480 

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 

GGC GTG AAC TAT GCA ACA GGG AAC CTT CCT GGT TGC TCT TTC TCT ATC 
528 

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175 

TTC CTT CTG GCC CTG CTC TCT TGC TTG ACT GTG CCC GCT TCG GCC TAC 
576 

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr 
180 185 190 

CAA GTG CGC AAC TCC ACG GGG CTT TAC CAC GTC ACC AAT GAT TGC CCT
624

Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp Cys Pro
195 200 205

AAC TCG AGC ATT GTG TAC GAG TAATAGGGAT CC
657

Asn Ser Ser Ile Val Tyr Glu
210 215 
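
The first 48 nucleotides of SEQ ID NO: 11 (clone GG1) and SEQ ID NO: 13 (clone CapA) are listed with the same translation even though the codons differ. The Python sketch below is an illustration only and not part of the original disclosure; it compares the two stretches under the standard genetic code. The helper names `clean`, `translate`, and `mismatches` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(dna):
    """Remove the spacing used in the listing and upper-case the bases."""
    return "".join(dna.split()).upper()


def translate(dna):
    """Translate complete codons, reading from the first base."""
    seq = clean(dna)
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def mismatches(a, b):
    """Return 1-based nucleotide positions at which two sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(clean(a), clean(b))) if x != y]


# Nucleotides 1..48 as printed in the listing (assumed free of OCR errors).
GG1_1_48 = "ATG GGC ACG AAT CCT AAA CCT CAA AAA AAA AAC AAA CGT AAC ACC AAC"
CAPA_1_48 = "ATG GGC ACG AAT CCT AAA CCT CAG AAG AAG AAC AAA CGT AAC ACC AAC"

print(mismatches(GG1_1_48, CAPA_1_48))
print(translate(GG1_1_48) == translate(CAPA_1_48))
```

Each differing position falls on the third base of a codon, which is consistent with the two clones encoding the same amino acids over this stretch.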

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 215 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30 



Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
100 105 110 

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140 

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr
180 185 190

Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu
210 215 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 453 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus 

(B) STRAIN: CDC 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: C1NC450 

(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..450 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATG GGC ACG AAT CCT AAA CCT CAG AAG AAG AAC AAA CGT AAC ACC AAC
48

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15



CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT 
96 

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30 



GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGA TTG GGT GTG CGC GCG 
144 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 



ACG AGA AAG ACT TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT 
192 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAG GCT CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG CCC GGG 
240 

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80

TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC GGG TGG GCG GGA TGG
288

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

CTC CTG TCT CCC CGT GGC TCT CGG CCT AGC TGG GGC CCC ACA GAC CCC
336

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

CGG CGT AGG TCG CGC AAT TTG GGT AAG GTC ATC GAT ACC CTT ACG TGC
384

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

GGC TTC GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT CTT
432

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140 

GGA GGC GCT GCC AGG GCC TAA 
453 

Gly Gly Ala Ala Arg Ala 
145 150 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 



Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140 

Gly Gly Ala Ala Arg Ala 
145 150 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 360 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus 

(B) STRAIN: CDC 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: C1NC360

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..357

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

ATG GGC ACG AAT CCT AAA CCT CAG AAG AAG AAC AAA CGT AAC ACC AAC 
48 

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT
96

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30 

GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGA TTG GGT GTG CGC GCG 
144 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

ACG AGA AAG ACT TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT 
192 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

ATC CCC AAG GCT CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG CCC GGG
240

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80 

TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC GGG TGG GCG GGA TGG 
288 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95

CTC CTG TCT CCC CGT GGC TCT CGG CCT AGC TGG GGC CCC ACA GAC CCC
336

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

CGG CGT AGG TCG CGC AAT TTG TAA 
360 

Arg Arg Arg Ser Arg Asn Leu 
115

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
100 105 110 



Arg Arg Arg Ser Arg Asn Leu 
115 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 273 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus

(B) STRAIN: CDC 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: C1NC270 

(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..270 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

ATG GGC ACG AAT CCT AAA CCT CAG AAG AAG AAC AAA CGT AAC ACC AAC 
48 

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15 

CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT 
96 

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30 



GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGA TTG GGT GTG CGC GCG 
144 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 



ACG AGA AAG ACT TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT 
192 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60 



ATC CCC AAG GCT CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG CCC GGG 
240 

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80 



TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TAA 
273 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 
85 90 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 90 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 
85 90 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 183 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Hepatitis C Virus 

(B) STRAIN: CDC 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: C100NC270 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..180

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATG GGT GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGA TTG GGT GTG
48

Met Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val
1 5 10 15

CGC GCG ACG AGA AAG ACT TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT 
96 

Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg
20 25 30

CAG CCT ATC CCC AAG GCT CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG
144

Gln Pro Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln
35 40 45 

CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TAA 

183 

Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val
1 5 10 15

Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg
20 25 30

Gln Pro Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln
35 40 45 

Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 
50 55 60 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 270 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Hepatitis C Virus 

(B) STRAIN: CDC 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: C100NC360 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..267 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

ATG GGT GGA GTT TAC TTG TTG CCG CGC AGG GGC CCT AGA TTG GGT GTG 
48 

Met Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val 
1 5 10 15



CGC GCG ACG AGA AAG ACT TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT 
96 

Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg
20 25 30 



CAG CCT ATC CCC AAG GCT CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG 
144 

Gln Pro Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln
35 40 45 



CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC GGG TGG GCG 
192 

Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala 
50 55 60 



GGA TGG CTC CTG TCT CCC CGT GGC TCT CGG CCT AGC TGG GGC CCC ACA 
240 

Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr 
65 70 75 80

GAC CCC CGG CGT AGG TCG CGC AAT TTG TAA
270 

Asp Pro Arg Arg Arg Ser Arg Asn Leu 
85 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val 
1 5 10 15 



Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg
20 25 30

Gln Pro Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln
35 40 45 



Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala 
50 55 60 



Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr 
65 70 75 80 

Asp Pro Arg Arg Arg Ser Arg Asn Leu 
85 



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 106 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Hepatitis C Virus

(B) STRAIN: CDC

(vii) IMMEDIATE SOURCE:

(B) CLONE: C1NC105

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..105

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

ATG GGC ACG AAT CCT AAA CCT CAG AAG AAG AAC AAA CGT AAC ACC AAC 
48 

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT
96

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

GGA GTT TTA A
106

Gly Val Leu
35


(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Leu
35
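
The correspondence between a nucleotide entry and its paired protein entry can be checked mechanically. The Python sketch below is an illustration only and not part of the original disclosure; it translates the CDS feature of SEQ ID NO: 25 (LOCATION 1..105) with the standard genetic code and compares the result with the three-letter listing of SEQ ID NO: 26. The function name `translate_cds` and the constant names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate_cds(dna, start, end):
    """Translate the 1-based inclusive range start..end into three-letter codes.

    Raises ValueError if the range is not a whole number of codons and
    KeyError if a stop codon lies inside the range.
    """
    seq = "".join(dna.split()).upper()[start - 1:end]
    if len(seq) % 3:
        raise ValueError("CDS length is not a multiple of three")
    return [ONE_TO_THREE[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# SEQ ID NO: 25 as printed (106 bases; the CDS covers bases 1..105).
SEQ_ID_25 = """
ATG GGC ACG AAT CCT AAA CCT CAG AAG AAG AAC AAA CGT AAC ACC AAC
CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT
GGA GTT TTA A
"""

# SEQ ID NO: 26 as printed.
SEQ_ID_26 = """
Met Gly Thr Asn Pro Lys Pro Gln Lys Lys Asn Lys Arg Asn Thr Asn
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
Gly Val Leu
"""

peptide = translate_cds(SEQ_ID_25, 1, 105)
print(len(peptide), peptide == SEQ_ID_26.split())
```

The same check could be applied to any nucleotide/protein pair in the listing by substituting the sequence text and the LOCATION range given under its FEATURE entry.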



Claims

1. A recombinant polypeptide identified by SEQ ID NO: 2 and which is specifically immunoreactive with sera from
humans infected with hepatitis C virus (HCV).

2. A polypeptide according to claim 1, which is encoded by the polynucleotide sequence identified by SEQ ID NO: 1.

3. A polypeptide according to claim 1 or claim 2, which is produced by the expression vector contained in an E. coli 
host identified by ATCC No. 40893. 

4. A recombinant polypeptide identified by SEQ ID NO: 8 and which is specifically immunoreactive with sera from 
humans infected with hepatitis C virus (HCV). 

5. A polypeptide according to claim 4, which is encoded by the polynucleotide sequence identified by SEQ ID NO: 7. 

6. A polypeptide according to claim 4 or claim 5, which is produced by the expression vector contained in an E. coli 
host identified by ATCC No. 40792. 

7. A recombinant polypeptide identified by SEQ ID NO: 10 and which is specifically immunoreactive with sera from 
humans infected with hepatitis C virus (HCV). 

8. A polypeptide according to claim 7, which is encoded by the polynucleotide sequence identified by SEQ ID NO: 9. 

9. A polypeptide according to claim 7 or claim 8, which is produced by the expression vector contained in an E. coli 
host identified by ATCC No. 40876. 

10. A diagnostic kit for use in screening human blood containing antibodies which are specifically immunoreactive with 
sera from humans infected with hepatitis C virus (HCV) comprising a polypeptide antigen according to any one of 
claims 1 to 9. 

11. A kit according to claim 10, wherein the detecting means includes a solid support to which the polypeptide is 
attached, and a reporter-labeled anti-human antibody, and wherein binding of the serum antibodies to the antigen 
can be detected by binding of the reporter-labeled antibody to the solid support. 

12. A method of detecting hepatitis C virus (HCV) infection in an individual, comprising reacting serum from an HCV-
infected test individual with a peptide antigen according to any one of claims 1 to 9, and examining the antigen for
the presence of bound antibody.

13. A method according to claim 12, wherein the peptide is attached to a solid support, wherein the step of reacting 
includes reacting the peptide antigen with the support and subsequently reacting the support with a reporter- 
labeled anti-human antibody, and wherein the step of examining includes detecting the presence of reporter- 
labeled anti-human antibody on the solid support. 

14. A method of producing a polypeptide which is immunoreactive with sera from humans infected with hepatitis C virus 
(HCV), comprising introducing into a suitable host an expression vector containing an open reading frame (ORF) 
having a polynucleotide sequence which encodes a polypeptide according to any one of claims 1 to 9, where the 
vector is designed to express the ORF in the host, and culturing the host under conditions resulting in the expres- 
sion of the ORF sequence. 

15. A method according to claim 14, wherein the expression vector is a lambda gt11 phage vector. 

16. A method according to claim 14 or claim 15, wherein the expression vector is a pGEX or pET vector. 

17. A polynucleotide which encodes a polypeptide according to any one of claims 1 to 9. 
STTGEIPFYGKAIPLE 
CC ACC ACC GGA GAG ATC CCT TTT TAC GGC AAG GCT ATC CCC CTC GAA 

VIKGGRHLIFCHSKKK 
GTA ATC AAG GGG GGG AGA CAT CTC ATC TTC TGT CAT TCA AAG AAG AAG 

CDELAAKLVALGINAV 
TGC GAC GAA CTC GCC GCA AAG CTG GTC GCA TTG GGC ATC AAT GCC GTG 

AYYRGLDVSVIPTSGD 
GCC TAC TAC CGC GGT CTT GAC GTG TCC GTC ATC CCG ACC AGC GGC GAT 

VVVVATDALMTGYTGD 
GTT GTC GTC GTG GCA ACC GAT GCC CTC ATG ACC GGC TAT ACC GGC GAC 

FDSVIDCNTCVTQTVD 
TTC GAC TCG GTG ATA GAC TGC AAT ACG TGT GTC ACC CAG ACA GTC GAT 

FSLDPTFTIETITLPQ 
TTC AGC CTT GAC CCT ACC TTC ACC ATT GAG ACA ATC ACG CTC CCC CAG 

DAVSRTQRRGRTGRGK 
GAT GCT GTC TCC CGC ACT CAA CGT CGG GGC AGG ACT GGC AGG GGG AAG 

PGIYRFVAPGERPSGM 
CCA GGC ATC TAC AGA TTT GTG GCA CCG GGG GAG CGC CCC TCC GGC ATG 

FDSSVLCECYDAGCAW 
TTC GAC TCG TCC GTC CTC TGT GAG TGC TAT GAC GCA GGC TGT GCT TGG 

YELTPAETTVRLRAYM 
TAT GAG CTC ACG CCC GCC GAG ACT ACA GTT AGG CTA CGA GCG TAC ATG 

NTPGLPVCQD* 
AAC ACC CCG GGG CTT CCC GTG TGC CAG GAC 

Fig. 5 
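
The pairing in Fig. 5 of one-letter residues with the codons printed beneath them can be cross-checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code; the function name `translate_row` and the way the codon table is built are illustrative choices and not part of the disclosure. The two codon rows are copied from the figure.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_row(codon_row: str) -> str:
    """Translate space-separated codons; incomplete or unknown codons give '?'."""
    return "".join(CODON_TABLE.get(codon, "?") for codon in codon_row.split())


# Two codon rows copied from Fig. 5 (second and eighth rows of the figure).
row_a = "GTA ATC AAG GGG GGG AGA CAT CTC ATC TTC TGT CAT TCA AAG AAG AAG"
row_b = "GAT GCT GTC TCC CGC ACT CAA CGT CGG GGC AGG ACT GGC AGG GGG AAG"

print(translate_row(row_a))
print(translate_row(row_b))
```

A partial codon, such as the leading `CC` in the first row of the figure, is reported as `?` rather than guessed, so the first residue of that row has to be read from the figure itself.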
Epitope Comparison/Delineation 

Coordinates marked on the diagram: 225, 2754, 3024, 3129 

Clone 33C (799bp) 
Clone 33C-U (529bp) 
409.1.1 (557bp) 
409.1.1 (a) (375bp) 
409.1.1 (ca) (445bp) 
409.1.1 (u) (105bp) 
409.1.1 (C) (70bp) 
409.1.1 (c+270) (340bp) 

Fig. 7 
10 20 30 40 50 60 

AAA (wild type) 

ATGGGCACGAATCCTAAACCTCAGAAGAAGAACAAACGTAACACCAACCGTCGCCCACAG 
aatccATGGGCACGAATCCTAAAC -> 
Primer C1 

METGlyThrAsnProLysProGlnLysLysAsnLysArgAsnThrAsnArgArgProGln 

70 80 90 100 110 120 

GACGTCAAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGG 
gaacccat[illegible]GAGTTTACTTGTTGCC -> 
Primer C100 

<- TCAGATCGTTGGTGGAGTTTtaataa[illegible] 
Primer NC105 

AspValLysPheProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArg 

130 140 150 160 170 180 

GGCCCTAGATTGGGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGT 
GlyProArgLeuGlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGly 

190 200 210 220 230 240 

AGACGTCAGCCTATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGG 
ArgArgGlnProIleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGly 

250 260 270 280 290 300 

TACCCTTGGCCCCTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCC 
aaacccat[illegible]TGCGGGTGGG[illegible]G 
Primer C270 -> 

<- CTCTATGGCAATGAGGGCtaaggatccggcc 
<- Primer NC270 

TyrProTrpProLeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerPro 

Fig. 8B 






310 320 330 340 350 360 

CGTGGCTCTCGGCCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGT 
aaacccaTGGGT 
(C360 ->) 

<- GGCGTAGGTCGCGCAATTTGtaaggatccggcc 
<- Primer NC360 

ArgGlySerArgProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGly 

370 380 390 400 410 420 

AAGGTCATCGATACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTC 
[illegible]ATCGATACC 
Primer C360 -> 

LysValIleAspThrLeuThrCysGlyPheAlaAspLeuMETGlyTyrIleProLeuVal 

430 440 450 460 470 480 

GGCGCCCCTCTTGGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGAC 
<- GGAGGCGCTGCCAGGGCCtaaggatccggcc 
<- Primer NC450 

GlyAlaProLeuGlyGlyAlaAlaArgAlaLeuAlaHisGlyValArgValLeuGluAsp 

490 500 510 520 530 540 

GGCGTGAACTATGCAACAGGGAACCTTCCTGGTTGCTCTTTCT 
<- GCAACAGGGAACCTTCCTGGTTaaggatccggcc 
<- Primer NC520 

GlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSerPheSerIlePheLeuLeuAla 

550 560 570 580 590 600 

CTGCTCTCTTGCTTGACTGTGCCCGCTTCGGCCTACCAAGTGCGCAACTCCACGGGGCTT 
<- CTGTGCCCGCTTCGGCCTAaggatccggcc 
<- Primer NC580 

LeuLeuSerCysLeuThrValProAlaSerAlaTyrGlnValArgAsnSerThrGlyLeu 

610 620 630 640 650 

TACCACGTCACCAATGATTGCCCTAACTCGAGCATTGTGTACGAGTAATAGGGATCC 
<- GTACGAGTAATAGGGATCCaaa 
<- Primer NC660 (3' primer) 

TyrHisValThrAsnAspCysProAsnSerSerIleValTyrGlu GlySer 

Fig. 8B (con't) 
Primers shown: C1; NC270, NC360, NC450, NC520, NC580; NC660 (3') 

Fig. 9 



GGI 



C1NC450 
C1NC360 
C1NC270 
C1NC105 

C100NC270 
C100NC360 
C270NC360 
C270NC450 



Fig. 10 